Abstract


A collaborative approach to software development is described. The approach 
employs the agile development techniques: project retrospectives, Scrum sta- 
tus meetings, and elements of Extreme Programming to efficiently develop a 
cohesive and extensible software suite. The software product under develop- 
ment is a fluid dynamics simulator for performing aerodynamic and aerother- 
modynamic analysis and design. The functionality of the software product 
is achieved both through the merging, with substantial rewrite, of separate 
legacy codes and the authorship of new routines. Examples of rapid im- 
plementation of new functionality demonstrate the benefits obtained with 
this agile software development process. The appendix contains a discus- 
sion of coding issues encountered while porting legacy Fortran 77 code to 
Fortran 95, software design principles, and a Fortran 95 coding stan- 
dard. 


Introduction 

The objective of the Fast Adaptive AeroSpace Tools (FAAST) program at 
NASA Langley Research Center is to develop fluid dynamic analysis and de- 
sign tools (ref. 1). The four primary elements in FAAST are CAD-to-Grid 
Methods, High Energy Flow Solver Synthesis (Hefss), Optimally Convergent 
Algorithms, and Efficient Adjoint Design Methods. This paper primarily fo- 
cuses on the software development practices adopted by the Hefss and design 
elements of FAAST. 

HEFSS aims to develop an unstructured-grid flow solver for hypersonic flow 
applications. This solver is to have the same chemical, thermal, and turbulence 
modeling capabilities as are presently available in the Langley structured-grid 
flow solvers LAURA (ref. 2) and Vulcan (ref. 3). The switch to unstructured- 
grid technologies is seen as an enabling capability to efficiently handle complex 
geometries and is synergistic with the CAD-to-Grid Methods’ unstructured- 
grid generation and adaptation efforts (refs. 4 and 5). This unstructured- 
grid hypersonic flow solver is to be obtained by extending the capabilities of 
the existing unstructured-grid transonic flow solver Fun3D (ref. 6). Factors 
related to the choice of Fun3D as the baseline unstructured-grid flow solver 
are discussed in appendix A on page 25. 

The complexity of HEFSS exceeds that of prior fluid dynamic development
projects at Langley, and HEFSS is the first tool developed by a sizable team
(see footnote 1). The prior development of computational fluid dynamic (CFD)
codes at Langley and at the former, co-located Institute for Computer
Applications in Science and Engineering (ICASE) has consisted of one or two
people working on focused applications or algorithms. Even when more people
contributed to the development of a code, one person contributed the bulk of
the code and served as gatekeeper for any changes. Examples of this paradigm
are listed in table 1. There is overlap in the capabilities of these codes
because their development processes were all sufficiently rigid as to make it
cheaper to develop an independent code with new functionality rather than to
extend an existing code. HEFSS aims to break this cycle by developing an
extensible product.

1 See ONERA’s elsA project (ref. 7) and DLR’s MEGAFLOW project (ref. 8) for other
examples of team CFD development.

Table 1. CFD Code, Architect, and Application Domain

Code       Architect        Application domain
Cfl3D      Rumsey/Biedron   Structured-grid (SG) aerodynamics
Laura      Gnoffo           SG hypersonic aerothermodynamics
Vulcan     White            SG hypersonic propulsion
Tlns3D     Vatsa            SG aerodynamics
Overflow   Buning           Overset SG aerodynamics
Usm3D      Frink            Unstructured-grid (UG) aerodynamics
Nsu3D      Mavriplis        UG aerodynamics
Fun3D      Anderson         UG aerodynamics and design
Felisa     Peraire          UG hypersonic aerodynamics

Heavyweight software engineering processes (see footnote 2), which accommodate
teams of tens to hundreds of programmers working with a relatively well-defined
set of requirements, were initially considered but rejected as being too
restrictive for a research team of about 10 people. However, the emerging agile
software development movement (see footnote 3) is perceived to be well suited
to the uncertain requirements and size of teams typically present in a research
environment. The agile movement views software development as an empirical,
rather than defined, process (ref. 9). To manage the empirical process, agile
methods incorporate rapid feedback mechanisms to enable constant steering and
place a renewed emphasis on the heart of software development: software
craftsmanship (ref. 10).

2 See www.sei.cmu.edu/cmm for example.

3 See www.agilealliance.org.

Making the switch from a one-code, one-developer paradigm to a team-based
approach is a significant culture change, but the ambitious goals of
HEFSS provided a strong motivation to look past skepticism and overcome
resistance to change. The scope of HEFSS required a group of developers
(initially 12 people at 25 to 100 percent work levels) with diverse areas of
expertise to collaborate on the software. The 18-month milestone for the
project was to demonstrate the synthesis of the structured-grid physical models
on a cylinder case by using an unstructured-grid discretization. In addition,
the existing functionality of the baseline Fun3D code was to be maintained
within the HEFSS code base. To compound matters, there was a lack of team
software development expertise. This critical ignorance was overcome through
consultant-led workshops, an invited lecturer series, a support contractor, and
two team members aggressively pursuing software development best practices
training (refs. 11 and 12).

This paper documents how the HEFSS team adapted and incorporated agile 
software development practices to develop the next generation CFD application
software. No claims are made that the correct processes were chosen
or that the current processes have fully matured. It is difficult to objectively 
gauge the HEFSS development process, but the project is ongoing, morale is 
high, its practices have been adopted by other teams, it was included in a 
group achievement award, and the local software engineering process group is 
using it as a model. The experience and lessons learned are offered as a case 
study, which may be useful to others with similar backgrounds and goals. 


Team Obstacles 

Before embarking on a discussion of team software development, a synop- 
sis of typical barriers to team formation and viability is necessary. Without 
providing a fertile team environment, the collaborative software development 
techniques laid out in the next section will not work. 

According to reference 13, a true team can be identified by its low turnover, 
its strong sense of identity, its sense of eliteness, its joint ownership of the prod- 
uct, and its members deriving more enjoyment from their work than expected 
from the nature of the work itself. Unfortunately, while there is an array 
of team-building techniques available, there is no simple recipe for creating 
cohesive teams. 

Meanwhile, methods known to destroy teams are well documented. Briefly, 
barriers to team formation and ongoing viability include these: lack of trust, 
the promotion system, defensive management, bureaucracy, physical separa- 
tion, fragmentation of people’s time, quality reduction of the product, phony 
deadlines, and clique control. For expanded discussion of these issues, consult 
references 13 and 14. 


Collaborative Software Development 

CFD software development at Langley has traditionally been performed in 
a rather unconstrained, self-governed environment. As mentioned earlier, 
most codes have typically been developed by one or perhaps two researchers.
This paradigm has worked relatively well and has produced software packages 
widely used by industry and academia. 
Unfortunately, such software development strategies often result in codes
that are complex and burdensome to maintain, and frequently subsequent 
working groups produce distinct versions of the code that are often incompat- 
ible with each other and previously released versions. Moreover, cohesiveness 
and portability are typically lost, as additional researchers contribute to the 
code, using their own coding style and practices. 

In contrast to this ad hoc approach to code development, the HEFSS team 
sought to incorporate the software industry’s best practices, not only because 
of the challenges of working as a cohesive team, but also to find methods 
which would extend the life cycle of the new code. Everyone on the team had 
experienced the pain of adding new capability to a large, existing code which 
was developed in an ad hoc manner. Even a seemingly innocuous bug fix was 
unnerving because there was no repeatable method to discover whether the 
fix would break existing capability in some subtle manner. 

A survey of industry best practices for software development was con- 
ducted, which included sponsoring a local ICASE lecture series entitled “Modern
Programming Practices” (see footnote 4). Meanwhile, two pathfinder projects
were conducted to gain hands-on experience. Detailed discussion and extensive refer-
ence lists are available in references 11 and 12. 

As described earlier, the emerging body of agile software development 
methodologies was determined to have the best fit with the inconstant na- 
ture of a scientific research environment. Specifically, Extreme Programming 
(XP) (ref. 15) appeared to be the most mature, although at the time, documentation
was limited to a few websites (see footnote 5). In addition, recent experience with
ISO 9001 edicts tended to steer the team away from defined process man- 
agement techniques implicit in methodologies like the Capability Maturity 
Model® (ref. 16) and its associated Team Software Process (ref. 17). 

The collection of collaborative software development practices described 
herein evolved from weekly meetings in which the challenges and possible so- 
lutions were discussed. Issues discussed cover fresh start versus retrofit ver- 
sus restructuring of existing code, language selection, coding standards, mod- 
ularization and maintainability versus efficiency, acceptance testing, source 
code management etiquette (see footnote 6), and documentation. As the HEFSS team
initially struggled with and then embraced new software development practices,
other teams (CAD-to-Grid, Design Optimization) within the FAAST project 
adopted many of the same practices. 

Specific software development techniques are discussed in the following 
sections, namely: XP, project retrospectives, status meetings, other commu- 
nication mechanisms, and documentation. 

4 See www.icase.edu/series/MPP.

5 See www.c2.com and www.extremeprogramming.org.

6 Source code management etiquette — when source code should and may be committed 
to a common repository. 
Extreme Programming

XP is founded on four values: communication, simplicity, feedback, and courage.
It was designed to keep the right communications flowing by employing many 
practices that cannot be done without communicating. XP also gambles that 
it is better to do a simple thing today and pay a little more tomorrow for any 
necessary changes than to do a more complicated thing today that may never 
be used; that is, in this universe one cannot “save time.” Meanwhile, XP’s 
feedback mechanisms cover many time scales since optimism is an occupa- 
tional hazard of programming and feedback is the treatment. Finally, courage 
enables one to escape local optima. 

Built from this value system, XP consists of 12 practices shown in table 2.
Also shown in the table is the level to which the HEFSS team has adopted each
practice. The ensuing sections serve to briefly describe each practice and also
to describe a practice in the context of the HEFSS team. Adjacent to the start
of each section are quotations from reference 15.

Table 2. Current Level of XP Adoption

Practice                 Adoption  Comments
Sustainable pace         Full      No compulsory overtime.
Metaphor                 Full      Using naive metaphor, i.e., CFD jargon.
Coding standards         Full      See appendix B on page 31.
Collective ownership     Full      Anyone can change any piece of code.
Continuous integration   Full      Automated build and test on three computer
                                   architectures.
Small releases           Partial   A portion of code base is currently export
                                   restricted; seeking to relieve this constraint.
Test-driven development  Partial   Fortran 90 unit test framework not widely
                                   used; however, Ruby codes are typically
                                   created using TDD.
Refactoring              Partial   Performed, but not mercilessly, due to lack
                                   of unit test coverage.
Simple design            Partial   Upfront, complex design is hard to resist,
                                   especially without strong test-driven
                                   development and refactoring.
Pair programming         Partial   Practiced, but not exclusively.
On-site customer         Partial   No outside customer is providing a business
                                   perspective, currently self-serving as
                                   customer for research products at hand.
Planning game            None      Have yet to invoke project management side
                                   of XP.
Margin quotation (ref. 15), Metaphor: “Guide all development with a simple
shared story of how the whole system works.”

Margin quotation (ref. 15), Coding Standards: “Programmers write all code in
accordance with rules emphasizing communication through the code.”

Sustainable Pace 

Formerly known as “40-hour week,” the sustainable pace practice probably
ranks the highest on the common sense scale, but it is also the most frequently 
violated by managers and developers alike. Since the majority of the research 
conducted with the HEFSS project is years from commercial use, compulsory 
overtime is simply not part of the working environment. 

Metaphor 

Employing a system metaphor that all participants can understand facilitates 
communication both within the code and within the team. Since all the team 
members are familiar with CFD jargon, the naive metaphor is used. 

Coding Standards 

Coding standards are usually dreaded and met with resistance because they 
are seen as adding a superfluous burden. After a brief discussion of the genesis 
of Hefss’s coding standard, several reasons are provided to demonstrate why 
a coding standard is not only necessary but actually quite beneficial for a team 
software development project. 

During the transition of legacy code from Fortran 77 to Fortran 95, 
a rough guess at a coding standard was created and used by the entire team. 
Based on this experience, a more detailed revision was created. (See ap- 
pendix B on page 31.) One duty of the full-time contractor assigned to the 
team is to enforce the coding standard as new content is committed to the 
repository. This function is slated to be replaced by an automated agent that 
parses the source code. 

Given a thoughtfully crafted coding standard, improved source code read- 
ability is a natural benefit through consistent indentation, alignment, naming, 
and commenting conventions. However, the coding standard must be appro- 
priately tailored to the programming language. For example, FORTRAN 95 
permits declaring an array variable and later dimensioning it through a sepa- 
rate statement. This multiline variable declaration can be hard to follow and 
can create confusion, thus prompting a line in the coding standard to place 
all attributes of the declaration on a single line, if possible. Another example 
is that the variable names of arguments in the calling and called routines do 
not have to match. However, retaining the same names for both improves 
global comprehension of the code and makes code-generated documentation 
more coherent. 

Margin quotation (ref. 15), Sustainable Pace: “Productivity does not increase
with hours worked; tired programmers are less productive than well-rested
ones.”

A coding standard also serves as a sentinel against the use of vendor-specific
language extensions or deprecated elements of the language that do not lend
themselves to portability across various platforms. For example, Fortran 95
does not contain a complete set of intrinsic functions for accessing system-level
utilities or timing, but many compiler vendors offer extensions like system()
and etime(), which are tempting but create portability headaches.
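
The automated agent for enforcing the coding standard is not described further in this paper. As a rough illustration only, the following Ruby sketch shows how such a checker might flag two of the issues discussed in this section: an array dimensioned in a statement separate from its declaration, and calls to the vendor extensions system() and etime(). The rule names, the regular expressions, and the check_source helper are assumptions made for this example; they are not the team's tool, and the comment stripping is deliberately crude.

```ruby
# Hypothetical coding-standard checker for free-form Fortran source text.
# Rule names and patterns are illustrative assumptions, not the team's agent.
RULES = {
  separate_dimension: /^\s*dimension\b/i,         # array shaped apart from its declaration
  vendor_extension:   /\b(?:system|etime)\s*\(/i  # non-standard system()/etime() calls
}.freeze

# Returns an array of [line_number, rule_name] pairs, one per violation found.
def check_source(text)
  violations = []
  text.each_line.with_index(1) do |line, number|
    code = line.sub(/!.*/, '') # crude comment strip; ignores '!' inside strings
    RULES.each do |name, pattern|
      violations << [number, name] if code =~ pattern
    end
  end
  violations
end

sample = <<~FORTRAN
  real :: q
  dimension q(5, 100)
  real, dimension(5, 100) :: residual
  call system('date')   ! vendor extension
FORTRAN

check_source(sample).each { |number, name| puts "line #{number}: #{name}" }
```

A checker of this kind could run as part of the commit or integration step, so that the standard is applied uniformly rather than by manual review.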

Collective Ownership 

The ideal situation for team software development occurs when a pair of de- 
velopers looks at a given piece of code and does not feel the need to change the 
indentation, and so forth, and furthermore cannot recall whether they wrote 
the code in the first place. No single developer claims code ownership, yet 
all share responsibility; all source code is eligible for changes by any team 
member. Using a coding standard is absolutely essential to reach this goal. 

Collective code ownership was a completely foreign concept to team mem- 
bers prior to this project. Initial acceptance of this philosophy came about 
because the original developer of the Fun2D/3D code was no longer at Langley,
and the current “code steward” did not feel comfortable claiming the code
as “his.” Both the software development practices mentioned above and the
tools the team uses for effective collaboration have cemented the idea of collective
code ownership to the extent that members feel comfortable changing the
code without asking permission of another developer.

Due to the team-oriented nature of the project and the amount of source
code involved, the team relies on a widely used source code management system,
the Concurrent Versions System (CVS) (see footnote 7). CVS oversees a central
repository of the source code and allows each team member to concurrently
develop and modify sections as needed. Any changes or additions to a local
working copy can then be committed to the repository, whereby they will be
available to the entire team.

CVS maintains complete documentation of any changes made during the 
course of code development, and previously modified or deleted code can be 
resurrected at any time by any member of the team. In addition, the system 
allows team members to work on platforms located virtually anywhere. The 
use of a software management tool allows for nearly seamless integration of a 
number of widely varying research projects and eliminates the need for multiple 
branches of a code (see footnote 8).

Continuous Integration 

In a team environment that has many developers who all contribute to a code 
base on a daily basis, integrating those changes into a common code base 
quickly becomes a major undertaking unless new code is integrated and tested 
as soon as practical, preferably within a few hours. 

7 www.cvshome.org

8 This CVS-controlled LaTeX document was jointly composed by the team using such an
approach.


Margin quotation (ref. 15), Collective Ownership: “Anyone can change any code
anywhere in the system at any time.”

Margin quotation (ref. 15), Continuous Integration: “Integrate and build the
system many times a day, every time a task is completed.”

Margin quotation (ref. 15), Small Releases: “Put a simple system into
production quickly, then release new versions on a very short cycle.”


Continuous integration avoids diverging or fragmented development efforts, 
in which developers are not communicating with each other about what can 
be shared or reused. Simply stated, everyone needs to work with the latest 
version of the code base. Making changes to obsolete code causes integration 
headaches. 

Originally, developers manually ran the Hefss test suite during code mod- 
ification, but not all developers consistently ran the test suite before checking 
their code modifications into the repository, so an automated process was 
sought. At first the Unix-based cron utility was used to check out a fresh
version of the CVS repository, to compile the suite of codes, and to run regression
tests on three different architectures and compilers every night. However,
the Extreme Programming community soon reminded the HEFSS team that
“[daily builds] are for winning-challenged people who can’t integrate every 5
to 15 minutes and run all the tests at every integration,” and the team went to a
true continuous integration mode of operation on dedicated machines.

The continuous integration process restarts the build and test process after 
each successful set of tests. Test results are automatically logged on a web 
server, and failures are e-mailed to all developers listing all CVS commits that 
were performed since the last successful build. With this system, errors are 
detected within a couple of hours, and the integration failure e-mail provides a
strong source of peer pressure on developers to run a range of tests before
committing changes (see footnote 9).
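
The notification step described above can be sketched in a few lines of Ruby. This is only a model of the logic: when a build or test cycle fails, list every commit made since the last successful build. The Commit structure, its fields, the sample log entries, and the wording of the notice are all assumptions for illustration, and no real CVS or mail commands are invoked.

```ruby
# Illustrative model of the failure notice; Commit fields and report wording
# are assumptions, and nothing here talks to CVS or to a mail server.
Commit = Struct.new(:author, :file, :time)

# Commits made after the last successful build are the suspects for a failure.
def suspects(commits, last_good_time)
  commits.select { |c| c.time > last_good_time }.sort_by(&:time)
end

# Returns nil when the tests pass, otherwise the text of the failure notice.
def failure_notice(commits, last_good_time, tests_passed)
  return nil if tests_passed

  lines = suspects(commits, last_good_time).map do |c|
    "#{c.time.strftime('%H:%M')} #{c.author} #{c.file}"
  end
  (['Build or tests failed; commits since last successful build:'] + lines).join("\n")
end

# Hypothetical commit log for one morning.
log = [
  Commit.new('dev_a', 'flux.f90',    Time.utc(2003, 5, 1, 9, 10)),
  Commit.new('dev_b', 'chem.f90',    Time.utc(2003, 5, 1, 10, 45)),
  Commit.new('dev_c', 'adjoint.f90', Time.utc(2003, 5, 1, 11, 5))
]
puts failure_notice(log, Time.utc(2003, 5, 1, 10, 0), false)
```

Because the cycle restarts after every successful run, the list of suspects stays short, which is what makes the failure e-mail actionable.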


Small Releases 

Feedback is the core idea behind the small releases practice. Get the soft- 
ware out there and learn from it. Strive to make the transition from pure 
software development to software maintenance as quickly as possible. Small 
releases are enabled by other practices like simple design, automated testing, 
and continuous integration. 

The source code management system described previously enables the team 
to automatically create releases by merely “tagging” snapshots of the reposi- 
tory for which all the tests pass successfully during the continuous integration 
cycle. The team typically makes several releases throughout any
given day. This snapshot feature also facilitates the management of releases 
to outside users by providing accurate technical support tailored specifically 
to the exact source code snapshot released to a given party. Unfortunately, 
the HEFSS code currently has some restrictions on its external distribution; 
however, it is being used in-house by several people (ref. 18). 


9 See www.martinfowler.com/articles/continuousIntegration.html for more information. 


Test-Driven Development 

Since the time to fix a software defect (aka “bug”) scales exponentially with 
the time lag between introduction and detection (ref. 19), it is extremely ad- 
vantageous to trap defects as early as possible during development. 

Previously known merely as “Testing,” this practice has blossomed into 
a whole field in itself (ref. 20). Test-driven development within XP has two 
components, one centered around developers and the other centered around 
customers, or end users. Developers write unit tests so that their confidence in 
the code can become part of the code itself, while customers write acceptance 
tests so that their confidence in the code’s capabilities can also become part 
of the code. These automated tests allow confidence in the code to grow over 
time, allowing the code to become more capable of accepting change. 

Unit tests are intended to verify small quanta of functionality within a code 
and should be automated and run to completion in fractions of a second. The 
unit tests serve as a development guide by specifying the desired capability, 
interfaces, and expected output of a functional unit. Unit tests also serve as 
mobility enablers during code architecture shifts to ensure a safe path was 
taken. Mobility allows code to be easily reused and to have functionality 
extended while safely maintaining current functionality. Note that in most 
cases, there will be more lines of unit test code than actual production code. 
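
As a minimal sketch of what such a unit test looks like, the Ruby example below specifies the expected behavior of a small conversion helper and then checks it. Ruby is used here because table 2 notes that the team's Ruby codes are typically created test first. The to_number helper and the tiny run_tests harness are hypothetical stand-ins written for this example; they are not the team's unit test framework.

```ruby
# Unit under test: a hypothetical character-to-number conversion helper.
# Accepts Fortran-style exponents such as "1.5d-3" as well as "1.5e-3".
def to_number(text)
  Float(text.strip.tr('dD', 'ee'))
end

# Minimal stand-in for a unit-test framework: run each named check and
# return the names of the checks that failed or raised an error.
def run_tests(tests)
  failed = tests.reject do |_name, check|
    begin
      check.call
    rescue StandardError
      false
    end
  end
  failed.keys
end

TESTS = {
  'plain decimal'      => -> { to_number('2.5') == 2.5 },
  'fortran d exponent' => -> { to_number('1.5d-3') == 1.5e-3 },
  'surrounding blanks' => -> { to_number('  42 ') == 42.0 },
  'rejects non-numeric' => lambda do
    begin
      to_number('abc')
      false
    rescue ArgumentError
      true
    end
  end
}.freeze

failures = run_tests(TESTS)
puts failures.empty? ? "#{TESTS.size} tests passed" : "FAILED: #{failures.join(', ')}"
```

Note that the checks are several times longer than the one-line routine they cover, which is consistent with the remark above about the ratio of test code to production code.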

Acceptance tests check the interactions between code elements that unit 
tests cannot cover and document the existence of a particular code feature. 
Preferably, customers write acceptance tests. 

Since the HEFSS code contains active research in many different disciplines 
that coexist in the same framework, work in one field can introduce errors 
in others through the common framework. These errors can go unnoticed if 
the code, in part and in whole, is not verified in a repeatable manner. One 
well-known approach to finding defects and ensuring that the code produces 
repeatable, verified answers is through automated testing. 

For example, an unforeseen interaction with module A is introduced by 
modifying code in module B. If the problem in module A goes undetected for 
a month, it may be difficult to link the problem to an interaction with module 
B or to other code modifications made during that month. If the problem 
in module A is detected in minutes by an automated testing framework, the 
interaction of module A and module B can be clearly identified before other 
code modifications cloud the picture. 

The current project began with legacy code that did not contain a single
unit test. Because retrofitting an exhaustive set of unit tests to the existing
legacy code was deemed too expensive, the original intent was to introduce unit
tests as new code was added and old code was refactored. To date, however,
unit testing has not been widely adopted by the team despite the creation of
a unit testing framework for Fortran 95 (see footnote 10). Currently, unit tests
cover only a very small percentage of the code base. However, significant unit
testing coverage is being built into Ruby-based wrappers used for testing and
grid adaptation. Additionally, some of the low-level Fortran library routines
are becoming test-infected, for example, character-to-number conversion
routines and linear algebra routines.

Margin quotations (ref. 15), Test-Driven Development: “Programmers continually
write unit tests, which must run flawlessly for development to continue.”
“Customers write tests demonstrating that features are finished. Any program
feature without an automated test simply doesn’t exist.”

The acceptance tests for the HEFSS code are a suite of over 240 regression
tests performed by a series of Makefiles. These regression tests simply compare
the convergence history of residual, force, and moment calculations (or other
output appropriate to the code under test) to previously recorded executions
to machine precision (not just 2-3 digits). The previously recorded results are
referred to as “golden files.” These test fixtures ensure that the current code
gives the same discrete answer as the original golden file. Makefiles were
initially selected to perform these tests because the tests were seen as a
natural extension to code compilation (see footnote 11). The compile operations
were incorporated into the tests, so the tests are always performed with an
executable file produced from the current source files. Test cases can be run
on an individual basis or as an entire suite.
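
The golden-file comparison can be illustrated with a short Ruby sketch. It assumes, purely for the example, that a convergence history is stored as whitespace-separated numbers with one iteration per line; the actual HEFSS file layout is not given in this paper. The parse_history and first_mismatch helpers are hypothetical names. The comparison uses no tolerance, so a difference in the last digit counts as a failure.

```ruby
# Parse a convergence history: one iteration per line, whitespace-separated
# numbers. This layout is an assumption made for the example.
def parse_history(text)
  text.each_line.map { |line| line.split.map { |field| Float(field) } }
end

# Returns the 1-based line number of the first difference between the golden
# and current histories, or nil when they agree exactly (no tolerance).
def first_mismatch(golden_text, current_text)
  golden  = parse_history(golden_text)
  current = parse_history(current_text)
  longest = [golden.size, current.size].max
  index = (0...longest).find { |i| golden[i] != current[i] }
  index && index + 1
end

# Hypothetical histories: iteration, residual, force coefficient.
golden  = "1 0.1250000000000000E+00 0.3141592653589793E+01\n" \
          "2 0.6250000000000000E-01 0.3141592653589790E+01\n"
current = "1 0.1250000000000000E+00 0.3141592653589793E+01\n" \
          "2 0.6250000000000000E-01 0.3141592653589791E+01\n"

line = first_mismatch(golden, current)
puts line ? "differs from golden file at line #{line}" : 'matches golden file'
```

Comparing parsed values rather than raw text means that formatting differences alone do not cause a failure, while any change in the computed numbers does.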

The current set of acceptance tests for HEFSS was added incrementally to
first cover the legacy functionality of Fun3D and then new functionality, as
it was added to the suite. The Makefiles that perform the tests have become
complex and hard to maintain, and they are being replaced in an incremental
fashion with unit-tested Ruby. This unit-tested Ruby framework should be much
easier to maintain and allow more flexibility. The Ruby framework can be
reused to link a number of the codes together to perform complex functions
such as design optimization and grid adaptation, in addition to testing.


Margin quotation (ref. 15), Refactoring: “Programmers restructure the system
without changing its behavior to remove duplication, improve communication,
simplify, or add flexibility.”


Refactoring 

To extend a code’s viable lifetime and strive for the simplest design that will 
work, developers need lots of practice modifying the design, so that when the 
time comes to change the system, they will not be afraid to try it. Constant 
refactoring is absolutely essential to keeping the cost-of-change curve from 
growing exponentially as time increases. Reference 21 teaches developers how 
to refactor and why. 

Automated testing, as discussed earlier, is absolutely essential to refactoring.
Without a safety net of tests, subtle shifts in the code’s fundamental
architecture toward a more agile, clean, and understandable design are extremely
difficult and frustrating. Testing allows developers to modify code that they
did not write so that the original developer can be sure that modified routines
still perform the original purpose correctly, if the appropriate unit tests pass.
This process leads to an environment in which the tests are paramount and
the code can be easily modified to add new functionality, improve speed, or
become more readable.

10 To facilitate both the writing and running of unit tests for Fortran 95 source code,
a testing framework called F95UNIT has been developed using Ruby. F95UNIT has a
model similar to the unit-testing frameworks for other languages, e.g., JUnit, PyUnit, Ruby
test/unit.

11 If the code is modified and needs to be recompiled, it should also be tested.

Due in part to the lack of extensive unit testing in the Hefss code, many 
refactorings are delayed, creating a backlog of work. Occasionally, the team 
will tackle some of these tasks, but so far the backlog continues to grow. A 
renewed effort at promoting the benefits of test-first programming is being 
made within the team by drawing attention to the inefficiencies inherent in 
the “Code-n-Fix” style of programming. 

Simple Design 

Simple design is defined by two ideas: One is the YAGNI principle, otherwise 
known as, “you aren’t gonna need it,” and the other is a chant, “do the simplest 
thing that could possibly work.” These principles should be internalized and 
provide instinctive reactions to “gold plating” or other ideas that do not seem 
to fit the current task. A simple design should not contain ideas that are not 
used yet but that are expected to be used in the future. However, one should 
pay attention to the word “expected.” If you are somehow assured of the 
future and that a given idea will be necessary, design with it in mind, but do 
not implement it now because you will best know how to add it when the time 
comes. 

As with refactoring, the lack of unit test coverage within Hefss code makes 
this practice difficult to follow completely. For many developers, it is also 
typically contrary to years of prior practice; regardless, the team can now at 
least recognize complexity, and several major strides have been made to reduce 
existing manifestations. 

Pair Programming 

The initial reaction to the idea of two people working on the same task at the
same computer at the same time is usually negative. However, this reaction
is typically caused by painful experiences associated with “pair debugging”
or simply misunderstanding the true nature of pair programming itself. Pair
programming is not one person programming while another person watches. It
is more akin to an animated conversation, facilitated by a white board, where
one participant might grab the marker from the other and make a change while
the first is still talking. Pair programming should be highly dynamic, and the
participants should be able to switch “driver” and “navigator” roles at any
point. Besides making programming more fun, pair programming provides an
extensive host of benefits, such as streamlining communication, propagating
knowledge, and continuous code reviews. Pair programming also greatly enhances
collective code ownership. For a detailed discussion of the art of pair
programming, see reference 22.

Margin quotation (ref. 15), Simple Design: “The system should be designed as
simply as possible at any given moment; extra complexity is removed as soon as
it is discovered.”

Margin quotation (ref. 15), Pair Programming: “All production code is written
with two programmers at one machine.”

Within the HEFSS team, frequent pair programming is highly encouraged 
but not mandated. Figure 1 shows an example of a dedicated pair programming 
station, which includes adjustable task chairs, an adjustable table, and 
multiple styles of wireless keyboards and mice. Note that the dual screens are 
attached to a single computer and simply provide more desktop space. [12] 
However, simply swapping a desk for a table is the essential step toward 
accommodating pair programming. 

Figure 1. Pair programming station. 

Pair programming is used for all aspects of code development, for example, 
debugging, teaching, refactoring, and adding new features. Intimately involving 
a number of researchers at the lowest levels of code development ensures 
a relatively high truck number. [13] Traditional CFD codes at Langley are 
developed by individuals or small teams, and most of the resulting code base 
has a truck number of 1 or perhaps 2, whereas the current collaborative team 
approach yields a value near 10. 

[12] There is rumored to be a study which measured a 70 percent productivity 
increase for software developers by simply doubling the screen real estate. 

[13] The truck number is the size of the smallest set of people in a project 
such that, if all of them got hit by a truck, the project would be in trouble. 
See c2.com/cgi/wiki?TruckNumber for further discussion. 


On-Site Customer 

This XP practice is intended to remove the communication barriers present 
in a typical contracted piece of software where a slew of requirements and 
specifications are defined up front and then the “code monkeys” are let loose 
to grind out the required piece of software. The pitfalls with this sort of 
contract negotiation are many, not the least of which is that customers seldom 
know what they want before they see a working prototype. By placing an end 
user with the team, XP is nearly guaranteed to deliver a relevant, useful 
piece of software. 

As discussed in reference 12, the scientific research environment often cre- 
ates a situation in which the developers are their own customers. This scenario 
requires diligent role playing to keep technical and business needs separated. 
Currently, the HEFSS team members largely act as their own customers, with 
only very minor input from project stakeholders. 


The Planning Game 

XP uses a four-dimensional space to plan and measure progress: time, cost, 
quality, and scope. Scope is typically ignored by many project-planning mech- 
anisms, but it plays a central role in XP. The planning game has two levels: 
iteration planning and release planning. The basic premise of the planning 
game is that business people determine scope, priority, composition of re- 
leases, and dates of releases, while technical people provide estimates, design 
consequences, the process, and detailed scheduling. 

As shown in table 2 on page 5, the Hefss team has not yet begun using 
this practice. However, full-cost accounting practices now being put into place 
may force this final XP practice to be invoked. 


Project Retrospectives 

Sometimes referred to as XP’s “thirteenth practice,” project retrospectives 
(ref. 23) are important components of tailoring a process to a given situation. 
Every few months, the team takes time to reflect on past events and accom- 
plishments. The goal is not faultfinding but learning how to do better in the 
future. During these sessions, the team begins with a discussion guided by 
the following three questions: what has gone well, what could be improved, 
and what new techniques or tools should the team investigate? Currently, 
these sessions are not as formal or wide-reaching as some of the formats 
presented in reference 23. 


Margin note (on-site customer): Include a real, live user on the team, 
available full-time to answer questions. 

Margin note (planning game): Quickly determine the scope of the next release 
by combining business priorities and technical estimates; as reality overtakes 
the plan, update the plan. 


Scrum Status Meetings 

A daily, stand-up meeting is normally associated with XP, but it is not explic- 
itly called out as a practice or given much structure except that nobody can sit 
during the meeting; it should be short, and it should happen every day before 
developers start pair programming. The Hefss team has adopted a similar, 
but more structured status meeting format from another agile methodology, 
Scrum (ref. 9). 

A Scrum status meeting is held daily by an appointed “Scrum Master” and 
lasts no longer than 15 minutes. The meeting has an open attendance policy, 
but only team members are allowed to talk. The team members, in turn, 
succinctly report three things: what they did since the last meeting, what 
they will do by the next meeting, and what got in the way (impediments). 
Additional discussion during a Scrum is strictly limited to clarification-related 
questions and to note topics that will be discussed at a later time by interested 
parties. The Scrum master plays the role of gatekeeper and takes notes. Later, 
the Scrum master compares performance with past commitments and follows 
up on situations that appear to be stalled. Most importantly, the Scrum 
master is responsible for removing impediments. 

Scrum status meetings have several benefits from a management perspec- 
tive. They offer a quick and easy mechanism to collect data for status reports 
and yield an immediate sense of whether a team is in trouble. By using Scrums 
to their benefit, management can avoid what Peopleware (ref. 13) claims is the 
ultimate management sin: wasting people’s time. 

Since the HEFSS team is currently dispersed throughout the local campus 
and most developers are not full time, the Scrum status meeting is only held 
weekly. In addition, the team allots some time afterward to address any 
topics which may have arisen during the Scrum. This post-Scrum gathering is 
governed by Open Space’s Law of Two Feet, [14] which states that if, during the 
course of any gathering, persons find themselves in a situation in which they 
are neither learning nor contributing, they must use their two feet and go to 
some more productive space. 

[14] See www.openspaceworld.com for more discussion. 


Other Communication Mechanisms 

Since communication and cooperation are essential to the success of the 
effort, several tools are employed in addition to the communication 
mechanisms implicit in XP. The first is a Majordomo-based electronic mailing 
list, which facilitates communication among team members who are 
distributed across the local campus. In addition to the e-mail list and 
weekly meetings, the team also uses a web-based collaborative tool known as 
a Wiki. [15] A Wiki allows users to freely create and edit web page content by 
using any web browser. Wikis have a simple text syntax for creating web page 
elements, and they dynamically create a new web page when they encounter a 
CamelCased word (a mixed-case word containing at least two capitals). The 
team uses the Wiki for a number of purposes. For example, the testing sta- 
tus page is contained in the Wiki so that anyone can add new data to the 
page. The Wiki is also used to share data for emerging test cases that have 
yet to be incorporated into the automated testing system, and it also serves 
as a repository for otherwise tacit knowledge, for example, CompilerNotes and 
CreatingNewTestCases. 
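
The CamelCase rule described above can be sketched in a few lines of Python. This is only an illustration of the stated definition (a mixed-case word containing at least two capitals); the function names are hypothetical, and the pattern actually used by the team's Wiki software is not given here and may differ.

```python
import re


def is_camel_cased(word):
    """Return True for a mixed-case word containing at least two capitals."""
    if not word.isalpha():
        return False
    capitals = sum(1 for ch in word if ch.isupper())
    has_lower = any(ch.islower() for ch in word)
    return capitals >= 2 and has_lower


def find_page_links(text):
    """Return the CamelCased words in text, in order of first appearance."""
    links = []
    for word in re.findall(r"[A-Za-z]+", text):
        if is_camel_cased(word) and word not in links:
            links.append(word)
    return links


sample = "See CompilerNotes and CreatingNewTestCases for the HEFSS Wiki."
print(find_page_links(sample))
```

Under this rule, an all-capitals acronym such as HEFSS and a word with a single capital such as Wiki do not trigger page creation, while CompilerNotes and CreatingNewTestCases do.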


Documentation 

Documentation for the HEFSS code takes many forms. While the HEFSS code 
itself currently lacks a formal user manual, [16] it does have a more exacting 
form of documentation: a large set of regression test cases. Each test case 
directory contains everything needed to run a given type of case and can 
usually be readily adapted to a new type of case. 

Meanwhile, developers have three tools available for browsing the HEFSS 
code base. Code browsing can take many forms and be done for various 
reasons; consolidating them into a single tool has so far proven to be an elusive 
goal. 

The simplest tool is a web-based rendering of the CVS repository, generated 
on-the-fly by the open source ViewCVS [17] tool. This approach is based on the 
CVS repository’s file directory structure and thus lacks the ability to navigate 
the source by using internal structure. However, it is the only tool that readily 
provides access to prior versions of the source code. 

A second tool, developed by a support service contractor using C++, parses 
the source code and generates web-based output by using a commercial tool, 
Understand for Fortran, [18] which extracts calling tree graphs and code 
statistics. The C++ code also creates tables of variable declarations and 
renders comments associated with routines that are placed according to the 
coding standard. The web pages generated by this tool include source code 
listings that have been formatted with line numbers and are keyword-colored 
to enhance readability. 

A third code-browsing opportunity leverages the open source code 
documentation system RDoc, [19] which was originally intended for documenting 
Ruby source. A short extension for this system was written to parse and 
format Fortran 95 [20] and has subsequently been accepted into the RDoc 
distribution. The RDoc system extracts a graph of the code source based on 
files, modules, and routines. From these data, it can generate frame-based 
web pages, XML, or Windows help files that can be used to navigate the 
calling structure. 

[15] www.wiki.org 

[16] The user manual is being written. 

[17] viewcvs.sourceforge.net 

[18] www.scitools.com/uf.html 

[19] rdoc.sourceforge.net 


Experience Adding New Functionality 

This section presents a few examples of new functionalities that have been 
incorporated into the code base. These additions have been facilitated by 
the current software development process. None of these extensions had been 
explicitly planned for during the initial code development, and the ease of their 
inclusion is testament to the agility of the process. 

Time-Accurate Simulations 

In support of both passive and active flow control research at Langley, the 
perfect-gas capabilities in the solver have been extended to higher order 
temporal accuracy. The validity of the approach has been verified through 
numerical experiments in which an order property consistent with a second-order 
scheme has been demonstrated for turbulent flows. With a trivial amount 
of effort, the modifications required to obtain these results in the perfect-gas 
realm were extended to include reacting-gas simulations. Current work focuses 
on evaluating third- and fourth-order time-integration schemes for perfect-gas 
flows (ref. 24), which should also be readily extendable to more complicated 
physical models as needed. 
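
The kind of numerical experiment used to confirm an order property can be illustrated on a model problem. The sketch below is not the HEFSS time-integration scheme, which is not specified here; it assumes a second-order backward difference (BDF2) scheme applied to the scalar decay equation dy/dt = -y, and the function names are hypothetical. Halving the time step and comparing errors yields the observed order of accuracy.

```python
import math


def bdf2_decay(dt, t_end=1.0, rate=1.0):
    """Integrate dy/dt = -rate*y, y(0) = 1, with BDF2 and return y(t_end)."""
    steps = round(t_end / dt)
    y_prev = 1.0
    y_curr = math.exp(-rate * dt)  # exact first step starts the two-step scheme
    for _ in range(steps - 1):
        # BDF2: (3*y_next - 4*y_curr + y_prev) / (2*dt) = -rate * y_next
        y_next = (4.0 * y_curr - y_prev) / (3.0 + 2.0 * rate * dt)
        y_prev, y_curr = y_curr, y_next
    return y_curr


def observed_order(dt):
    """Estimate the order of accuracy from the errors at dt and dt/2."""
    exact = math.exp(-1.0)
    err_coarse = abs(bdf2_decay(dt) - exact)
    err_fine = abs(bdf2_decay(dt / 2.0) - exact)
    return math.log(err_coarse / err_fine) / math.log(2.0)


order = observed_order(0.02)
print("observed order of accuracy:", order)
```

For a second-order scheme, halving the time step should reduce the error by roughly a factor of four, so the printed value should be close to 2.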

Incorporating Multiple Element Types 

Initially, the HEFSS solver made sole use of tetrahedral element types to 
discretize a given domain. However, the ability to accommodate additional 
element types such as prisms, hexahedra, and pyramids provides greater 
flexibility to match a given element type to a particular flow topology, and the 
extension to include such elements in all aspects of the package is currently 
ongoing. This effort represents one of the most substantial modifications to 
the software to date, if not the most substantial, because it extends the 
fundamental data structure used throughout the code base. The pre-/post-processor 
and solvers, as well as all of their associated linearizations for optimization 
and adaptation, require considerable modification at the most fundamental 
levels. This undertaking has revealed many areas in which additional 
refactoring is still required before an acceptable level of modularity is achieved. 

[20] This extension was accomplished with only 120 lines of code. 


Two-Dimensional Capability 


A major advantage of pursuing mixed-element discretizations is the ability 
to recover a truly two-dimensional solution capability, which can be achieved 
through the use of prismatic and hexahedral elements in the spanwise direc- 
tion, such that flux balances need only be performed in the plane of symme- 
try. Axisymmetric flows can also be readily accommodated by adding source 
terms. The benefits of such an approach are substantial, in that a separate 
code need not be maintained for such problems, a longtime burden for the 
original FUN2D/3D developers. In addition, all algorithms and physical mod- 
els available in the three-dimensional path are immediately available for two- 
dimensional solutions, which allows basic research to be carried out on less 
costly two-dimensional problems. When computations are extended to three 
dimensions, the inconsistencies normally associated with switching between 
two separate solvers are no longer an issue, and the results are not contami- 
nated by differences in discretizations or solution methods. 


Multigrid Algorithms 

A major thrust of the FAAST project is aimed at achieving textbook multigrid 
efficiency (TME), an effort that could drastically reduce solution times for 
complex problems (ref. 25). Since the baseline unstructured-grid solver used 
as the foundation for the current work did not include options for multigrid 
acceleration, much work has focused on implementing such a capability. 

The use of an agglomeration multigrid algorithm relies on an edge-based 
discretization of the governing equations; this requirement precludes the abil- 
ity to compute solutions to the full Navier-Stokes equations on mixed-element 
grids. For this reason, a geometric non-nested multigrid approach has been 
initially chosen for the HEFSS solver. Operations such as coarse-grid partition- 
ing and intergrid transfers in a complex domain-decomposed environment have 
been developed, and a multigrid algorithm has been implemented. Although 
this capability has been coded primarily with perfect-gas applications in mind, 
the scheme has been implemented such that users performing reacting-gas com- 
putations will also be able to make immediate use of this research without the 
need to duplicate extensive low-level code development typically associated 
with geometric multigrid on domain-decomposed unstructured meshes. 

One component necessary to achieve TME is a line-implicit solver to over- 
come stiffness associated with high-aspect ratio grid elements. The ability to 
form lines suitable for implicit relaxation, to obtain an appropriate partition- 
ing, and to perform an exact inversion along each line has been developed and 
is applicable to any set of physical equations being solved (ref. 26). 
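
The exact inversion along a line amounts to solving a tridiagonal system, because each point on a line couples only to its two neighbors on that line. A minimal sketch using the Thomas algorithm is given below. It assumes scalar unknowns for brevity; a solver for coupled equations would carry a small block at each point instead, and the function name and storage layout are hypothetical rather than taken from the HEFSS source.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system exactly with the Thomas algorithm.

    lower[i] multiplies x[i-1] (lower[0] is unused) and upper[i] multiplies
    x[i+1] (upper[-1] is unused). All four lists have the same length.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination sweep along the line.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution in the opposite direction.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Four points on one line with a diffusion-like stencil (-1, 2, -1).
solution = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 0.0, 5.0],
)
print(solution)
```

The cost grows linearly with the number of points on the line, which is what makes implicit relaxation along lines affordable on highly stretched grids.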


Incorporating High-Energy Physics 

The thermochemical nonequilibrium models in Hefss are identical to those 
in LAURA, but their implementation is substantially different. LAURA made 
extensive use of precompiler directives that allocated memory and defined the 
code path according to a diverse set of options. This compilation strategy 
evolved from the absence of a dynamic memory allocation capability in FORTRAN 
when LAURA was originally coded and from a desire to completely 
eliminate any model-dependent conditional statements within loops that could 
compromise vector efficiency. Any change in the gas model required a 
recompilation of the source code. LAURA employs a script to guide a user through 
the various permutations and combinations of options, but the process is 
burdensome to a user conducting parametric studies. In contrast, HEFSS only 
needs to be compiled once on any platform, regardless of the desired physics 
model options. 

Model parameters in Laura are initialized in block data routines; these 
routines have been replaced by formatted data files that use conventional for- 
matted reads and namelists in the Hefss solver. Model parameters that are 
unlikely to be changed by the user (thermodynamic curve fit constants, species 
molecular weights, and heats of formation) are assembled in one set of data 
files. Gas model options that are likely to be changed by the user on a frequent 
basis, such as the chemical composition of the gases entering the domain or 
the thermochemical model, are assembled in a separate file. This separation 
minimizes the amount of setup required to perform a given analysis. 

Adjoint Solver and Sensitivity Analysis 

As important as the software practices in this effort are to the development 
of new analysis capabilities, they are absolutely critical to the success of the 
design element under FAAST. In references 26, 27, and 28, a discrete adjoint 
capability has been developed for the solver. This effort represents the only 
capability of its kind and relies on several hundred thousand lines of exact 
hand-differentiated linearizations of the preprocessor, flow solver, and mesh 
movement codes with respect to both the dependent variables and the grid 
coordinates. Table 3 shows sensitivity derivatives of the lift and drag 
coefficients with respect to several shape design variables for fully turbulent 
flow over an Onera M6 wing (ref. 29), computed by using the discrete adjoint 
formulation for free-stream conditions of Mach 0.84, a 3.06° angle of attack, 
and a Reynolds number of 5 million. The adjoint results are in excellent 
agreement with those obtained using a complex-variable approach (ref. 30) 
with a step size of 1 × 10^-30. This accuracy can easily be compromised by a 
single error anywhere in the source code. With a dozen researchers modifying 
code on a daily basis, the use of continuous integration and automated testing 
is critical in maintaining such accuracy. Just as residual and force convergence 
histories are monitored to machine accuracy for the flow solver on several 
architectures, similar quantities are constantly tested for the adjoint solver and 
gradient evaluation codes. This constant testing ensures that discrete 
consistency between the analysis and design tools is always maintained, regardless 
of the modifications being implemented in other parts of the software. 

Table 3. Comparison of Discrete Adjoint and Complex Variable Design Variable 
Derivatives for Coefficients of Lift and Drag for Fully Turbulent Flow 
Over an Onera M6 Wing 

              Camber              Thickness            Twist                Shear 
C_L adjoint   0.956208938269467   -0.384940321071468   -0.010625997076936   -0.005505627646872 
C_L complex   0.956208938269046   -0.384940321071742   -0.010625997076937   -0.005505627647001 
C_D adjoint   0.027595818243822    0.035539494383655   -0.000939653505699   -0.000389373578383 
C_D complex   0.027595818243811    0.035539494383619   -0.000939653505699   -0.000389373578412 

In addition to the hand-differentiated code, a Ruby code similar to the effort 
described in reference 31 has been developed to automatically convert the 
codes in the HEFSS suite to a complex-variable formulation. This capability 
can immediately recover a forward mode of differentiation for the entire 
solver at any time, with no user intervention. Like the hand-differentiated 
code, this procedure is continuously tested. 
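
The complex-variable approach can be illustrated on a scalar function. If a real-valued code is evaluated with a complex argument x + ih, the derivative is recovered as Im[f(x + ih)]/h, with no subtractive cancellation, which is why a step as small as 1 × 10^-30 can be used. The sketch below is an assumption-laden stand-in: sample_output is a hypothetical analytic function playing the role of a solver output such as a lift coefficient, and the automatic Ruby conversion of Fortran source is not shown.

```python
import cmath
import math


def complex_step_derivative(func, x, step=1.0e-30):
    """Approximate df/dx at real x as Im(func(x + i*step)) / step."""
    return func(complex(x, step)).imag / step


def sample_output(x):
    """Hypothetical stand-in for a solver output; accepts complex input."""
    return cmath.exp(x) * cmath.sin(x)


x0 = 1.5
derivative = complex_step_derivative(sample_output, x0)
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
print("complex step:", derivative)
print("analytic:    ", exact)
```

Because the perturbation lives entirely in the imaginary part, the result matches the analytic derivative to nearly all available digits, which makes it a useful independent check on hand-differentiated code.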


Design Optimization 

Approximation and Model Management Optimization (AMMO) techniques 
(refs. 32, 33, and 34) have been recently added to the Hefss software set. 
AMMO is a methodology aimed at maximizing the use of low-fidelity models 
in iterative procedures with occasional but systematic recourse to 
higher-fidelity models for monitoring the progress of the algorithm. In current 
demonstrations, AMMO has exhibited threefold to fivefold savings in terms of 
high-fidelity simulations on aerodynamic optimization of 3D wings and 
multielement airfoils, where simplified physics models (e.g., Euler) computed on 
coarse grids serve as low-fidelity models, while more accurate models (e.g., 
Navier-Stokes) computed on finer grids serve as high-fidelity models. AMMO 
was the first approach for using variable-fidelity models analytically 
guaranteed to converge to high-fidelity answers. 

Because AMMO relies on using a variety of models in a single optimiza- 
tion run, maintaining continuous integration and consistency with the en- 
tire software set is especially crucial for obtaining stable optimization results. 
However, designing a testing strategy for optimization presents an interesting 
challenge because optimization algorithms require reasonably well converged 
analyses and are, therefore, expensive. Procedures for automated testing of 
optimization software are currently under development. 


Output Error Correction and Grid Adaptation 

One thrust of the FAAST program is to develop a mathematically rigorous 
methodology to adapt a grid discretization to directly improve the calcula- 
tion of an output function. An adjoint-based error correction and adaptation 
scheme has produced excellent results in two dimensions (ref. 35). This scheme 
is being extended to three dimensions and incorporated into HEFSS (ref. 5). 
This error correction and adaptation scheme requires the calculation of flow 
and adjoint residuals on embedded grids with interpolated solutions. The 
modularity of the HEFSS reconstruction, flux, and adjoint routines facilitated 
this calculation. 

The interpolation of the solution onto the embedded grid requires the cal- 
culation of least-squares gradients. This gradient routine was readily shared 
between the flow and adjoint codes. The element-based interpolation scheme 
was developed test-first with the F95UNIT framework. The code to compute 
the flow and adjoint residuals consists of only a small driver routine; the re- 
mainder of the code is reused from the flow and adjoint solvers. The anisotropic 
adaptation metric is calculated with code that was also developed test-first by 
using the F95UNIT framework. 
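
A least-squares gradient of the kind mentioned above can be sketched as follows. The example is restricted to two dimensions and uses an unweighted fit for brevity; the weighting, dimensionality, and data layout of the actual gradient routine are not described here, so the function name and arguments are hypothetical.

```python
def least_squares_gradient(center, center_value, neighbors, neighbor_values):
    """Fit the gradient at `center` from neighboring point values.

    Minimizes the sum over neighbors k of
    (center_value + g . (x_k - center) - value_k)**2 for g = (gx, gy).
    """
    sxx = sxy = syy = bx = by = 0.0
    for (x, y), value in zip(neighbors, neighbor_values):
        dx = x - center[0]
        dy = y - center[1]
        du = value - center_value
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        bx += dx * du
        by += dy * du

    # Solve the 2x2 normal equations [sxx sxy; sxy syy] g = [bx; by].
    det = sxx * syy - sxy * sxy
    if det == 0.0:
        raise ValueError("neighbors do not span two dimensions")
    return ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)


def linear_field(x, y):
    """Test field with known gradient (3, -2)."""
    return 3.0 * x - 2.0 * y + 1.0


node = (0.2, 0.1)
stencil = [(1.0, 0.0), (0.0, 1.0), (-1.0, -0.5), (0.5, -1.0)]
gradient = least_squares_gradient(
    node, linear_field(*node), stencil, [linear_field(x, y) for x, y in stencil]
)
print(gradient)
```

A linear field is reproduced exactly on an irregular stencil, which is the sort of property a test-first approach can pin down in a unit test before the routine is written.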


Concluding Remarks 

The Fast Adaptive AeroSpace Tools (FAAST) team uses techniques from the 
arena of commercial software development to implement an agile process for 
managing a development effort on a production software tool set. The agile 
aspects, such as collective ownership, simple design, and the lack of change 
boards, enable the rapid development of previously unplanned-for functional- 
ity. At the same time, the rigorous aspects, such as continuous integration 
and testing, maintain a stable base for the existing functionality. 

An additional benefit of the present software process is that since there 
is only one code base for all of the development efforts, advances in one area 
of functionality immediately become part of the mainstream capability and 
are thus readily available to other researchers and users. For example, time- 
accuracy enhancements developed in the context of perfect gas flows were 
easily extended to apply for chemically reacting flows. 

One remarkable aspect of this project is that developers who previously 
shuddered at the word “process” gelled into a team that uses a fairly rigorous, 
pervasive software process that they enjoy. 


Colophon 

This paper is typeset in Donald Knuth’s Computer Modern font with the free, 
multiplatform LaTeX [21] typesetting system using the NASA class developed by 
Wood and Kleb. [22] The auxiliary LaTeX packages are array, dcolumn, fancyvrb, 
multirow, rcsinfo, tabularx, textcomp, time, url, varioref, and xspace. 


Appendix A 

Code History and Architecture 


This appendix provides a detailed review of the Hefss code history and 
its architecture. The first section contains an explanation of how the baseline 
code, Fun3D, was selected, followed by a section describing how Fortran 
95 was selected as the programming language. These reviews are followed 
by two sections that cover porting Fun3D to Fortran 95 and the use of 
object-oriented design principles. 

Baseline Code Selection 

Three Langley unstructured-grid codes, Usm3D, Fun3D, and Felisa (ref. 36), 
were considered as the initial template for the Hefss code. Felisa, an in- 
viscid, unstructured-grid flow solver, already has considerable success in the 
hypersonic domain. It also has equilibrium and thermochemical nonequi- 
librium gas models. While the addition of thermochemical nonequilibrium 
source terms, thermodynamic models, and transport models was perceived to 
be straightforward, considerable effort would have been required to introduce 
the viscous terms, the viscous flux Jacobians, and an implicit solution scheme. 
Both Usm3D and Fun3D are highly successful codes for computing viscous 
flow on unstructured grids within the subsonic to low supersonic speed regimes. 
Ultimately, Fun3D was selected because it is more robust in the hypersonic 
domain, which is apparently attributable to its combination of Roe Flux Dif- 
ference Splitting, flux reconstruction, and associated limiters. In addition, its 
discretizations are similar to Laura, and the discrete adjoint capability for 
perfect gas design (refs. 26, 27, and 28) and grid adaptation (refs. 5, 35, 37, 
38, 39, and 40) was judged particularly appealing for future hypersonic design 
and grid adaptation. A successful retrofitting of Fun2D with thermochemical 
nonequilibrium models confirmed the viability of this approach. 


Programming Language 

Most of the CFD codes developed at Langley are written in Fortran 77 and 
often rely on nonportable extensions such as vendor-specific functions or links 
with C code. For the current project, the team sought a single, unifying stan- 
dard language under which to develop new code. After surveying the available 
programming languages and deciding that a mixed-language code base would 
increase complexity too much, Fortran 95 was selected for the new suite 
of codes. Fortran 95 promises the numerical performance of Fortran 77 
with the advanced features of other languages, such as dynamic memory al- 
location, derived types, recursion, and modules. This choice also allows a 
relatively straightforward conversion of a substantial legacy code base written 
in Fortran 77. 

The selection of FORTRAN 95 was tempered by the commitment to deliver 
a hypersonic flow simulation with thermochemical nonequilibrium on a geo- 
metrically simple configuration within 18 months. Adoption of a programming 
language significantly different from FORTRAN would have required a learning 
period for the majority of the team members, who were already proficient with 
Fortran 77. The time required to bring team members up to speed in a new 
language, plus the time required for conversion of legacy FORTRAN 77 to a 
language outside the Fortran family, was judged too costly, relative to the 
potential benefit offered by any other language. 

Fortran 95 training was tailored to team needs in a two-part workshop. 
Dan Nagle, from Purple Sage Computing Solutions, [A1] spent a day with the 
team learning the Hefss code objectives and the architecture of the legacy 
code. Using this material, he prepared a two-day course which highlighted 
Fortran 95 features suited to the Hefss project. 

Auxiliary scripting for controlling code compilation, templating, and testing 
is performed with Ruby (refs. 41 and 42) and Make. [A2] Ruby is an open 
source, object-oriented, threaded-scripting language with cross-platform sup- 
port, while Make is an open source compilation tool. 


Porting and Restructuring 

To lay a solid foundation for the new suite of solvers, Fun3D and the phys- 
ical models from Laura and Vulcan were ported from a mixture of C and 
Fortran 77 to Fortran 95. Porting Fortran 77 code to Fortran 95 
was initially thought to be a simple process that could be accommodated by 
using a combination of homegrown scripts and a commercial software package, 
FORESYS. [A3] FORESYS was helpful when implicit none was requested because 
it would automatically declare all variables used in the routine. It also 
provided instructive diagnostics for various classes of errors during the con- 
version process and when replacing common blocks by modules. However, 
it invariably reformatted lines and destroyed symmetric forms of equations 
that had been carefully introduced by earlier authors, and it repositioned or 
silently eliminated comments. Eventually, Ruby and Perl scripts were crafted 
to handle tedious, error-prone operations such as code indentation and the 
conversion of continuation symbols without losing the comments and other 
structured formatting. The remainder of the conversion was done manually. 

As the team had a chance to study the legacy structure, it became clear that 
the old arrangements of common blocks and subroutines were counter to the 
modularity and extensibility the team was trying to create. Therefore, during 
the port to Fortran 95, common routines and functions were extracted and placed 
in a single, shared library directory, while data structures such as boundary 
conditions, grid metrics, and solution quantities were generalized to handle an 
arbitrary number of equations and were encapsulated in derived types. 

[A1] users.erols.com/dnagle 

[A2] www.gnu.org/software/make/make.html 

[A3] Foresys is a trademark of Connexite S.A.; for more information, see 
www.simulog.fr/is/2fore1.htm. 

The use of derived types provides additional flexibility over Fortran 77; 
however, early versions of Fortran 95 compilers often displayed a significant 
performance penalty when these constructs were used in the computationally 
intensive regions of the solver. [A4] Consequently, the restructuring effort often 
required reworking these core routines to recover performance comparable to 
the legacy solver. 

This transformation took nearly a year and was not without difficulties, but 
it was definitely a worthwhile effort because it gave team members hands-on 
experience with a code most had never seen before, instead of merely accepting 
the results of an automatic conversion. The conversion process also gave the 
team an opportunity to create and tailor a coding standard [A5] suited to their 
style and knowledge. In addition, the total number of lines of source code was 
reduced by some 40 percent, in itself a significant benefit from the standpoint 
of code maintenance. 

Modularity and Encapsulation 

Modularization, abstraction, information hiding, and encapsulation are also 
means used to enhance code maintainability and bring the additional promises 
of code reuse, reduced complexity, extensibility, and orthogonality. [A6] 
Abstraction is the process of picking out common features of objects 
or procedures and replacing them with a single, more general function. 
Information hiding reduces complexity by hiding details of an object or function 
so the developer can focus on the object without worrying about the hidden 
details. Encapsulation, or combining elements to create a larger entity, is one 
mechanism to achieve this goal. 

The Fortran 95 constructs of modules, interface statements, public and 
private declarations, and derived types were employed to implement these 
ideas. Fortran 95 modules are similar to the class construct in object-oriented 
languages, while derived types are akin to structures. Modules were 
designed to abstract types of operations (e.g., file input/output, memory 
allocation, interprocessor communication, execution timing, and linear algebra). 
Many modules employ a generic interface statement that automatically 
detects the type, kind, and rank of the calling arguments at compile time 
and matches them to an appropriate low-level routine, which allows the modules 
to be largely independent of any particular flow solver since data are only 
exchanged through well-defined interfaces. Many of these Fortran 95 interface 
statements are produced automatically in the build process by a Ruby script which 
emulates the template system available in C++. In the remainder of this section, 
specific examples are given to demonstrate the benefits of modularization 
and data encapsulation. 

[A4] See appendix C on page 34 for current results. 

[A5] See appendix B on page 31. 

[A6] In this case, orthogonal is used in the sense of mutually independent or well separated. 

Memory allocation 

Array memory allocation is handled by a single interface statement in a module 
that automatically detects the type, kind, and rank of the argument and calls 
the appropriate low-level routine for the allocation and initialization. This 
abstraction streamlines memory allocation requests throughout the code since 
memory tracking and diagnostics can be placed and maintained in a single 
location. 
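The generic interface idea described above can be sketched as follows. This is a minimal illustration only, not the HEFSS implementation: the module name allocations, the generic name my_alloc, and the counter elements_allocated are all hypothetical. Pointer arrays are used because Fortran 95 does not permit allocatable dummy arguments.

```fortran
! Hypothetical sketch: all names are illustrative, not taken from HEFSS.
module allocations

  implicit none

  private
  public :: my_alloc, elements_allocated

  integer, parameter :: dp = selected_real_kind(p=15)

! running total kept in a single place for diagnostics
  integer, save :: elements_allocated = 0

! the compiler selects the specific routine from the type and rank
! of the actual arguments
  interface my_alloc
    module procedure alloc_int_1d
    module procedure alloc_real_1d
    module procedure alloc_real_2d
  end interface

contains

  subroutine alloc_int_1d ( array, n )
! Fortran 95 does not allow intent on pointer dummy arguments
    integer, dimension(:), pointer :: array
    integer, intent(in)            :: n
    continue
    allocate( array(n) )
    array = 0
    elements_allocated = elements_allocated + n
  end subroutine alloc_int_1d

  subroutine alloc_real_1d ( array, n )
    real(dp), dimension(:), pointer :: array
    integer, intent(in)             :: n
    continue
    allocate( array(n) )
    array = 0.0_dp
    elements_allocated = elements_allocated + n
  end subroutine alloc_real_1d

  subroutine alloc_real_2d ( array, n1, n2 )
    real(dp), dimension(:,:), pointer :: array
    integer, intent(in)               :: n1, n2
    continue
    allocate( array(n1,n2) )
    array = 0.0_dp
    elements_allocated = elements_allocated + n1*n2
  end subroutine alloc_real_2d

end module allocations

program allocation_demo

  use allocations, only: my_alloc, elements_allocated

  implicit none

  integer, parameter :: dp = selected_real_kind(p=15)

  integer,  dimension(:),   pointer :: cell_flags
  real(dp), dimension(:,:), pointer :: coordinates

  continue

! the same call name serves both an integer rank-1 and a real rank-2 array
  call my_alloc( cell_flags, 10 )
  call my_alloc( coordinates, 3, 4 )

  print *, 'elements allocated: ', elements_allocated

  deallocate( cell_flags, coordinates )

end program allocation_demo
```

Because every request passes through one module, additional bookkeeping (for example, a high-water mark) could be added in one place without touching the calling code.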

Parallel Communication 

Originally, the baseline solver relied on a shared-memory implementation spe- 
cific to SGI® hardware and was not portable to the increasingly popular 
cluster-based, distributed-memory computing platforms. Moreover, the com- 
munication operations were dispersed throughout the solver, and any modifi- 
cations to the communication model needed to be made in numerous locations 
throughout the code. In the current work, the message passing interface (MPI) 
standard was selected. Interprocessor communication has been abstracted 
from all but the lowest levels of the source code and is now encapsulated in a 
single module. 

With this centralized approach to MPI communication, it is now trivial to
make sweeping changes to the parallel aspects of the code, including removing
the communication entirely to produce a sequential version of the code. This
abstraction also benefited the team when the high-energy, reacting-gas portion
of the code was parallelized successfully on the first attempt. Normally, a
developer would expect to spend considerable time debugging interprocessor
communication.
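A sequential stand-in for such a communication module might look like the sketch below. It is an assumption-laden illustration rather than the actual HEFSS module: the names comm_layer, comm_start, comm_global_sum, and comm_stop are hypothetical, and the bodies contain no message passing at all. In a parallel build, the same interfaces would wrap the corresponding MPI initialization, reduction, and finalization calls.

```fortran
! Hypothetical sketch: a serial version of a communication module.
! The solver sees only these interfaces, so a parallel version could be
! substituted without changing any calling code.
module comm_layer

  implicit none

  private
  public :: dp, comm_start, comm_global_sum, comm_stop

  integer, parameter :: dp = selected_real_kind(p=15)

  integer, save :: n_processes = 0

contains

  subroutine comm_start ( )
    continue
! a parallel build would initialize the message passing library here
    n_processes = 1
  end subroutine comm_start

  function comm_global_sum ( local_value ) result ( global_value )
    real(dp), intent(in) :: local_value
    real(dp)             :: global_value
    continue
! a parallel build would perform a reduction across processes here;
! with a single process the global sum is just the local value
    global_value = local_value
  end function comm_global_sum

  subroutine comm_stop ( )
    continue
! a parallel build would shut down the message passing library here
    n_processes = 0
  end subroutine comm_stop

end module comm_layer

program comm_demo

  use comm_layer, only: dp, comm_start, comm_global_sum, comm_stop

  implicit none

  real(dp) :: local_residual, global_residual

  continue

  call comm_start ( )

  local_residual  = 0.25_dp
  global_residual = comm_global_sum ( local_residual )
  print *, 'global residual: ', global_residual

  call comm_stop ( )

end program comm_demo
```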

Boundary Conditions 

Another area in which modularity and data encapsulation have provided a 
significant benefit is in the treatment of boundary conditions. The baseline 
Fun3D solver was extremely deficient in its ability to handle a wide range of 
boundary conditions. The user was restricted to inviscid, inflow/outflow, and 
viscous boundary types. Information required for these boundary types was
contained in hard-coded data structures specific to each condition and was
dispersed throughout the code. This design had become extremely limiting in
recent applications and was clearly not sufficient for extension to high-energy
flows, where a large array of boundary condition types is required.

Using Fortran 95 derived types to encapsulate boundary condition infor- 
mation, the baseline solver was completely refactored to allow the straightfor- 
ward addition of new boundary types. For any given boundary condition, all 
necessary data are contained in a boundary condition type. An array of these 
derived types then constitutes all boundaries in a given problem. For bound- 
ary conditions requiring additional physical data, a link to an additional data 
structure specific to that boundary condition is encapsulated. Derived types 
also allow the additional enrichment of the data structure without modifying 
argument lists. In this manner, any number of different boundary groups can 
be efficiently handled at the higher levels of the solver and unrolled for use as 
needed. 
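The arrangement described above, an array of boundary derived types with an optional link to condition-specific data, can be sketched as follows. The type and component names (boundary_group, wall_data, bc_name, and so on) are hypothetical and chosen for illustration; the actual HEFSS types are not reproduced here.

```fortran
! Hypothetical sketch: names and components are illustrative only.
module boundary_defs

  implicit none

  private
  public :: dp, wall_data, boundary_group

  integer, parameter :: dp = selected_real_kind(p=15)

! extra physical data needed only by certain boundary conditions
  type wall_data
    real(dp) :: wall_temperature
  end type wall_data

! data common to every boundary, plus a link to optional extra data
  type boundary_group
    character(len=16)        :: bc_name
    integer                  :: n_faces
    type(wall_data), pointer :: wall => null()
  end type boundary_group

end module boundary_defs

program boundary_demo

  use boundary_defs, only: dp, wall_data, boundary_group

  implicit none

! an array of derived types constitutes all boundaries in the problem
  type(boundary_group), dimension(2) :: bc
  integer                            :: ib

  continue

  bc(1)%bc_name = 'farfield'
  bc(1)%n_faces = 120

  bc(2)%bc_name = 'viscous_wall'
  bc(2)%n_faces = 300
  allocate( bc(2)%wall )
  bc(2)%wall%wall_temperature = 300.0_dp

  do ib = 1, size(bc)
    if ( associated(bc(ib)%wall) ) then
      print *, trim(bc(ib)%bc_name), ' wall temperature: ',                 &
               bc(ib)%wall%wall_temperature
    else
      print *, trim(bc(ib)%bc_name), ' carries no additional data'
    endif
  enddo

  deallocate( bc(2)%wall )

end program boundary_demo
```

Adding a component to boundary_group later would not change the argument list of any routine that receives the array bc.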

It should be noted that this data structure also allows for a natural han- 
dling of cost functions based on boundary data required for the design and 
grid adaptation capabilities within FAAST. Objective functions composed of 
viscous and/or pressure contributions can easily be specified on any subset or 
combination of boundary groups such that a specific flow feature or region of 
the domain can be targeted. For example, if it is determined that a strong 
shock on the outboard section of a wing is responsible for a severe wave drag 
penalty, a cost function can easily be formulated based solely on the contri- 
bution of that boundary group to the total drag. This method represents a 
substantial improvement over the baseline capabilities, in which all boundary 
groups necessarily contributed to a given cost function. 
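A cost function restricted to selected boundary groups might be assembled along the following lines. This is a sketch under stated assumptions: the type, the in_cost_function flag, the group names, and the drag values are all invented for illustration and do not come from the FAAST codes.

```fortran
! Hypothetical sketch: the flag, names, and numbers are illustrative only.
program cost_function_demo

  implicit none

  integer, parameter :: dp = selected_real_kind(p=15)

  type boundary_group
    character(len=16) :: bc_name
    real(dp)          :: pressure_drag
    real(dp)          :: viscous_drag
    logical           :: in_cost_function
  end type boundary_group

  type(boundary_group), dimension(3) :: bc
  real(dp)                           :: cost
  integer                            :: ib

  continue

! arbitrary values; only the outboard wing is targeted by the design
  bc(1) = boundary_group( 'fuselage',      0.0040_dp, 0.0030_dp, .false. )
  bc(2) = boundary_group( 'inboard_wing',  0.0060_dp, 0.0020_dp, .false. )
  bc(3) = boundary_group( 'outboard_wing', 0.0090_dp, 0.0010_dp, .true.  )

  cost = 0.0_dp
  do ib = 1, size(bc)
    if ( bc(ib)%in_cost_function ) then
      cost = cost + bc(ib)%pressure_drag + bc(ib)%viscous_drag
    endif
  enddo

  print *, 'cost function from selected groups: ', cost

end program cost_function_demo
```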


Gas Physics 

Modules, interfaces, and derived types are used extensively for the gas phase 
physics modules, which include thermodynamics, transport properties, ther- 
mal relaxation, and chemical kinetics. The thermodynamics module contains 
the initial interface from the flow solver to gas phase physics. The trans- 
port property module interfaces with the flow solver and the thermodynamics 
module to define molecular viscosity, conductivity, and species diffusivities. 
The thermal relaxation module is engaged when populations of excited states 
(rotational, vibrational, and electronic modes) cannot be defined by a single 
temperature. This module provides the source terms that define energy ex- 
change among the available, thermally distinct modes. The chemical kinetics 
module provides source terms for the species continuity equations that define 
the rate of production or destruction of species. 

In conclusion, it should be noted that code architecture changes are ongoing,
both because the HEFSS project started with a large legacy code base and
because modularity and data encapsulation are elusive goals that are really
only earned through experience. In addition, there are drawbacks to
modularization that must be considered. For example, it was originally
anticipated that compilers could optimize high-level constructs like derived
types as if they were written using their lower-level counterparts. However,
as appendix C on page 34 reveals, such is not always the case in practice.
Appendix B
Coding Standard 


Note: bracketed numbers refer to line numbers in the sample program which follows. 

Style 

• Free format with no character past column 80 

• Indentation: begin in first column and recursively indent all subsequent blocks by 
two spaces. 

• Start all comments within body of code in first column [42]. 

• Use all lowercase characters; however, mixed-case may be used in comments and 
strings. 

• Align continuation ampersands within code blocks [77]. 

• No tab characters 

• Name ends [85].

Comments 

• For cryptic variable names, state the description in a comment line immediately
preceding the declaration or at the end of the declaration line [62].

• For subroutines, functions, and modules, insert a contiguous comment block imme- 
diately preceding declaration containing a brief overview followed by an optional 
detailed description [42]. 

Variable Declarations 

• Do not use Fortran intrinsic function names. 

• Avoid multiline variable declarations. 

• Declare intent on all dummy arguments [63]. 

• Declare the kind for all reals, including literal constants, using a kind definition 
module. 

• Declare dimension attribute for all nonscalars [63]. 

• Line up attributes within variable declaration blocks. 

• Any scalars used to define extent must be declared prior to use [60].

• Declare a variable name only once in a scope, including use module statements. 

Module Headers 

• Declare implicit none [35]. 

• Include a public character parameter containing the CVS $Id$ tag [37]. 

• Include a private statement and explicitly declare public attributes. 
Subroutines and Functions

• The first executable line should be continue [69]. 

• Use the only attribute on all use statements [58].

• Keep use statements local, i.e., not in the module header. 

• Group all dummy argument declarations first, followed by local variable declarations. 

• All subroutines and functions must be contained within a module. 

• Any pointer passed to a subroutine or function must be allocated by at least size 1 
to avoid null or undefined pointers. 

Control Constructs 

• Name control constructs (e.g., do, if, case) which span a significant number of lines 
or form nested code blocks. 

• No numbered do-loops.

• Name loops that contain cycle or exit statements. 

• Use cycle or exit rather than goto. 

• Use case statements with case defaults rather than if-constructs wherever possible. 

• Use F90-style relational symbols, e.g., >= rather than .ge. [73]. 

Miscellaneous 

• In the interest of efficient execution, consider avoiding: 

— assumed-shape arrays 

— derived types in low-level computationally intensive numerics 

— use modules for large segments of data 

• Remove unused variables. 

• Do not use common blocks or includes. 

Illustrative Example 


 1 ! Define kinds to use for reals in one place
 2
 3 module kind_defs
 4
 5   implicit none
 6
 7   integer, parameter :: sp=selected_real_kind(P=6)  ! single precision
 8   integer, parameter :: dp=selected_real_kind(P=15) ! double precision
 9
10   character (len=*), parameter :: kind_defs_cvs_id = &
11     '$Id: cs_example.f90,v 1.6 2003/12/03 14:11:51 kleb Exp $'
12
13 end module kind_defs
14
15 ! A token module for demonstration purposes
16
17 module some_other_module
18
19   implicit none
20
21   character (len=*), parameter :: some_other_module_cvs_id = &
22     '$Id: cs_example.f90,v 1.6 2003/12/03 14:11:51 kleb Exp $'
23
24   integer, parameter :: some_variable
25
26 end module some_other_module
27
28 ! A collection of transformations which includes
29 ! stretches, rotations, and shearing. This comment
30 ! block will be associated with the module declaration
31 ! immediately following.
32
33 module transformations
34
35   implicit none
36
37   character (len=*), parameter :: transformations_module_cvs_id = &
38     '$Id: cs_example.f90,v 1.6 2003/12/03 14:11:51 kleb Exp $'
39
40 contains
41
42 ! Computes a stretching transformation.
43 !
44 ! This stretching is accomplished by moving
45 ! things around and going into a lot of other details
46 ! which would be described here and possibly even
47 ! another "paragraph" following this.
48 !
49 ! This contiguous comment block will be associated with the
50 ! subroutine or function declaration immediately following.
51 ! It is intended to contain an initial section which gives
52 ! a one or two sentence overview followed by one or more
53 ! "paragraphs" which give a more detailed description.
54
55   subroutine stretch ( points, x, y, z )
56
57     use kind_defs
58     use some_other_module, only: some_variable
59
60     integer, intent(in) :: points
61
62 ! component to be transformed
63     real(dp), dimension(points), intent(in)  :: x, y
64     real(dp), dimension(points), intent(out) :: z ! transformation result
65
66     external positive
67     integer :: i
68
69     continue
70
71     i = 0
72
73     if ( x(1) > 0.0_dp ) then
74       call positive ( points, x, y, z )
75     else
76       do i = 1, points
77         z(i) = x(i)*x(i) + 1.5_dp * ( real(i) + x(i) )**i &
78              + ( y(i) * real(i) ) * ( x(i)**i + 2.0_dp )  &
79              + 2.5_dp * real(i) + 148.2_dp * some_variable
80       enddo
81     endif
82
83   end subroutine stretch
84
85 end module transformations
Appendix C

Fortran 95 Considerations 


The rationale for some elements of the coding standard presented in the
previous section is discussed in this section.


Best Practices 

The use of implicit none minimizes the possibility of variable type errors. 
An example of a type error is when the implicit FORTRAN integer typing 
scheme creates integers for variable names beginning with the letters “i” through 
“n” when the user had intended a real variable. This unintended declaration 
type is avoided because implicit none requires every variable to be declared 
explicitly. 

The use of only C1 prevents unintended changes to values of other variables
in the inherited modules. The only statement also facilitates finding
the module that provides the inherited variable. To further restrict access 
to variables or subroutines in modules, a private statement is to be placed 
at the top of the module. An exclusive and explicit list of public entities 
is therefore required to share module data and methods outside the module. 
This exclusivity prevents unintended variable modifications. 

Use of equality comparison with reals should be avoided because small
round-off errors may be present. Instead, the difference between the two
variables is compared to an intrinsic function like tiny() to provide a more
reliable comparison.
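The pitfall can be illustrated with a short sketch. Note one deliberate substitution: rather than tiny(), this example builds a relative tolerance from the epsilon() intrinsic, and the factor of 10 is an arbitrary assumption chosen for illustration, not a recommendation from the coding standard.

```fortran
! Hypothetical sketch: the tolerance definition is an assumption.
program real_comparison_demo

  implicit none

  integer, parameter :: dp = selected_real_kind(p=15)

  real(dp) :: a, b, tolerance
  integer  :: i

  continue

! accumulate 0.1 ten times; round-off usually keeps this from being
! exactly equal to 1.0
  a = 0.0_dp
  do i = 1, 10
    a = a + 0.1_dp
  enddo
  b = 1.0_dp

! relative tolerance scaled by the magnitude of the operands
  tolerance = 10.0_dp * epsilon(b) * max( abs(a), abs(b) )

  print *, 'exact equality test     : ', a == b
  print *, 'tolerance-based test    : ', abs(a-b) <= tolerance

end program real_comparison_demo
```

On typical IEEE double-precision hardware the exact test is expected to fail while the tolerance-based test passes, although the outcome of the exact test can depend on the compiler and its optimization settings.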

In general, the select case conditional construct is more efficient than an
if-elseif construct since if-elseif might require several condition
evaluations, while select case contains only one condition evaluation. The
select case construct is analogous to the deprecated computed goto. C2 Also,
select case constructs convey control logic in a clearer fashion and allow for
cleaner error handling through the default case.
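A brief sketch of this idiom follows. The module name, function name, and integer boundary codes are hypothetical; the point is only that a single selector is evaluated once and that unexpected values are caught by the default case.

```fortran
! Hypothetical sketch: the integer codes and names are illustrative only.
module bc_names

  implicit none

  private
  public :: describe_bc

contains

! Returns a short description for an integer boundary-condition code.
  function describe_bc ( bc_code ) result ( description )

    integer, intent(in) :: bc_code
    character(len=24)   :: description

    continue

    select case ( bc_code )
    case ( 1 )
      description = 'inviscid'
    case ( 2 )
      description = 'inflow/outflow'
    case ( 3 )
      description = 'viscous'
    case default
! unexpected codes are handled in one clearly visible place
      description = 'unknown boundary code'
    end select

  end function describe_bc

end module bc_names

program select_case_demo

  use bc_names, only: describe_bc

  implicit none

  integer :: bc_code

  continue

  do bc_code = 1, 4
    print *, bc_code, ' -> ', trim( describe_bc(bc_code) )
  enddo

end program select_case_demo
```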


Performance Considerations 

Throughout the Fortran 95 restructuring of the Fun3D solver, several
efficiency issues pertaining to advanced coding constructs were uncovered.
Features such as derived types and modules are extremely attractive for
communicating data; however, it was found that current Fortran 95 compilers
often

C1 For example, use aModule, only : aVariable

C2 See groups.google.com/groups?threadm=9o7uhi%24pus%241%40eising.k-net.dk for
further discussion.
failed to produce performance comparable to that of conventional Fortran 77
constructs such as passing data through calling argument lists.

Data Sharing With Modules 

An intermediate restructuring of Fun3D relied almost exclusively on the use
of Fortran 95 modules. By eliminating virtually every argument list in
the solver, an exceptionally clean code was obtained. However, in subsequent
testing, this implementation was shown to be several times slower in execution
speed than the legacy C/Fortran 77 solver. Upon closer inspection, it was
found that the use of modules to communicate large segments of data can be
extremely inefficient. To illustrate this degradation in performance, the test
code included as appendix D on page 40 has been executed on a range of
platforms and compilers as listed in table C1. Here, data are communicated
with a


Table C1. Compilers Used in Performance Study

Vendor          Options                                           Release      O/S              Platform
Absoft™         -O3 -cpu:p6                                       8.0-1        Linux® 2.4.18    Intel® P3
Compaq®         -arch ev67 -fast -O4 -tune ev67                   X1.1.1-1684  Linux® 2.4.2     Alpha EV67
HP®             -O3                                               2.4          HP-UX® B.10.20   HP® 9000
IBM®            -O5                                               7            AIX® 3           IBM® 7044
Intel®          -O3 -ipo -wK                                      7.1-008      Linux® 2.4.18    Intel® P3
Lahey-Fujitsu   --o2 --nwarn -static --nsav --ntrace --nchk -x -  6.20a        Linux® 2.4.18    Intel® P3
NAG®            -O4 -Wc,-malign-double -ieee=full -unsharedf95    4.2          Linux® 2.4.18    Intel® P3
NA Software     -fast                                             2.2-1        Linux® 2.4.18    Intel® P3
PGI®            -fast                                             4.1-1        Linux® 2.4.18    Intel® P3
SGI®            -O2                                               7.3.1.2m     IRIX® 6.5        SGI® R10000
Sun             -fast                                             6.2-2        SunOS™ 5.8       Sun™ Blade 1000


file I/O routine, as well as a routine that performs a large amount of arbitrary
floating-point manipulations. In addition to an array A passed through a
traditional argument list interface, an identical array B is also passed to and
from the subroutines through the use of a Fortran 95 module. For this test, the
extent of the arrays is 20M, a value on the order of that encountered in typical
aerodynamic simulations. The results are normalized on the data obtained
using the argument list model. As can be seen in table C2, use of the module
Table C2. Unformatted Disk I/O Using 20M Integers and 20M Reals
(run times normalized by the assumed-size argument list result)

Compiler        Assumed size   Module   Derived type   Assumed shape
Absoft™             1.00         1.00       1.03           1.04
Compaq®             1.00         0.98       6.47           0.99
IBM®                1.00         1.03       1.01           1.03
Intel®              1.00         1.16       1.14           1.05
Lahey/Fujitsu       1.00         6.05       5.99           1.04
NAG®                1.00         1.02       1.22           1.03
NA Software         1.00         0.99       1.08           1.01
PGI®                1.00         0.98       1.00           0.98
SGI®                1.00        31.91      31.37          34.80
Sun                 1.00         0.98       1.02           0.98

construct can incur severe penalties for unformatted disk I/O. The module
interface is over 30 times slower than the data transferred via a conventional
argument list on an SGI®. For floating-point arithmetic, the module interface
exhibits run times on the order of 20 percent higher than the computations
using data brought in through an argument list, as shown in table C3. Due

Table C3. Compute Work Using 20M Integers and 20M Reals
(run times normalized by the assumed-size argument list result)

Compiler        Assumed size   Module   Derived type   Assumed shape
Absoft™             1.00         1.16       1.84           1.20
Compaq®             1.00         1.40       1.47           1.38
IBM®                1.00         2.76       2.76           2.76
Intel®              1.00         0.97       0.98           0.95
Lahey/Fujitsu       1.00         1.07       1.07           1.02
NAG®              Aborted
NA Software         1.00         0.95       1.13           0.92
PGI®                1.00         1.96       1.96           0.94
SGI®                1.00         1.10       1.10           1.07
Sun                 1.00         1.42       1.40           1.07


to this performance degradation, the module construct is employed sparingly 
in the Hefss solver as a means to share large data structures. Only small 
amounts of data such as free-stream quantities, algorithmic parameters, and 
turbulence modeling constants are shared through modules. 

Derived Types 

The baseline C/Fortran 77 solver was also refactored to make extensive use
of the Fortran 95 derived type construct. The derived type is very attractive
in the sense that a number of related quantities can be encapsulated in a single
variable, yielding relatively short argument lists throughout the code. Using
this paradigm, variables related to the computational grid are stored in a
grid type; solution-related variables are located in a soln type, and so forth.
When a low-level routine requires a fundamental piece of data such as the
coordinates of a grid point i, the information can be extracted as grid%x(i),
grid%y(i), and grid%z(i). Arrays of derived types are also supported under
Fortran 95, making the implementation of algorithms such as multigrid and
multiple instances of quantities, such as boundary groups, straightforward.


As in the case of modules, it was found that the use of derived types can
also incur severe execution penalties. As shown in the derived type column of
tables C2 and C3, a test similar to the one described previously has been
performed on an array C transferred as the component of a derived type
variable. It can be seen in tables C2 and C3 that this coding idiom can yield
execution times more than 30 times slower for unformatted disk I/O and nearly
a factor of three slower for floating-point operations than the argument list
model.


The current HEFSS solver uses derived types to encapsulate much of its
data structures; however, the components of these types required by low-level
routines are extracted at the calling level and are received as conventional
scalars and arrays in the I/O- and compute-intensive portions of the code.
This model allows simple argument lists at the higher levels of the code, while
maintaining the performance of the baseline solver.

From a developer's point of view, derived types are one of the more useful
enhancements of Fortran 95 over Fortran 77. They allow the developer to string
together variables in meaningful groups and treat them as a single entity when
desired. The HEFSS code uses a number of derived types. For example, the grid
derived type contains all the information needed for the specification of the
discretized mesh: x, y, z values for each point in space, cell volumes,
cell-face normals and areas, connectivity information, and so on. Any of this
information is available with the simple construct grid%variable, e.g.,
grid%x. Derived types may also be nested, extending their usefulness. For
example, the grid derived type in the HEFSS code encompasses a boundary
condition derived type that contains all the necessary data to impose boundary
conditions: the physical condition (e.g., solid wall), the locations of points
on the boundary, surface normals, and so forth. In addition, the definition of
the derived type may be extended at a future date without affecting existing
code. For example, adding a cell-face velocity for moving grid applications
would involve a one-line addition to the type definition and would be
completely transparent to sections of code not requiring this information.
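The practice of extracting derived-type components at the calling level can be sketched as follows. The names grid_defs, grid_type, scale_grid, and scale_coordinates are hypothetical, and the scaling operation merely stands in for a compute-intensive kernel; this is not code from the HEFSS solver.

```fortran
! Hypothetical sketch: names and the scaling operation are illustrative.
module grid_defs

  implicit none

  private
  public :: dp, grid_type, scale_grid

  integer, parameter :: dp = selected_real_kind(p=15)

  type grid_type
    integer                         :: n_points
    real(dp), dimension(:), pointer :: x => null()
  end type grid_type

contains

! High-level routine: short argument list using the derived type.
  subroutine scale_grid ( grid, factor )

    type(grid_type), intent(inout) :: grid
    real(dp),        intent(in)    :: factor

    continue

! components are extracted here, at the calling level
    call scale_coordinates ( grid%n_points, grid%x, factor )

  end subroutine scale_grid

! Low-level routine: receives conventional scalars and arrays only.
  subroutine scale_coordinates ( n, x, factor )

    integer,                intent(in)    :: n
    real(dp), dimension(n), intent(inout) :: x
    real(dp),               intent(in)    :: factor

    integer :: i

    continue

    do i = 1, n
      x(i) = factor * x(i)
    enddo

  end subroutine scale_coordinates

end module grid_defs

program extraction_demo

  use grid_defs, only: dp, grid_type, scale_grid

  implicit none

  type(grid_type) :: grid

  continue

  grid%n_points = 3
  allocate( grid%x(grid%n_points) )
  grid%x = (/ 1.0_dp, 2.0_dp, 3.0_dp /)

  call scale_grid ( grid, 2.0_dp )
  print *, grid%x

  deallocate( grid%x )

end program extraction_demo
```

The inner loop never dereferences the derived type, so the kernel is written in the same form as its Fortran 77 counterpart.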
Assumed-Shape Arrays

As shown in tables C2 and C3 on page 36, some compilers treat arguments 
passed via assumed-shape arrays as poorly as they did derived types. Assumed- 
shape arrays can be noncontiguous, and thus interfacing to old Fortran 77 
routines may require data to be copied to form a contiguous data block. These 
data copies can cause a large increase in the total memory required to compute 
a flow solution for some compilers as compared to others. 

Memory Copies 

Occasionally, it is desirable to bring variables into a routine via argument lists
rather than modules, as demonstrated in tables C2 and C3. However, unexpected
behavior was detected on certain platform/compiler combinations when
argument lists were combined with low-level module use. In these instances,
the variables in the modules were not synchronized with the argument list 
variables. This synchronization issue was resolved when argument lists were 
used consistently throughout the subroutines that needed access to the data. 
It was eventually surmised that this problem was due to memory copies made 
by some compilers during a subroutine call. When that data copy was modi- 
fied, it was no longer synchronized with the original data stored in the module 
and accessed with use. Also, on return from the subroutine, the local copy of 
the data was used to overwrite the data stored in the module, possibly erasing 
any modifications of the original data while the copy existed. This behavior 
appears to be very compiler and application specific and very difficult to detect 
and instrument. 

Compilation Errors, Warnings, and Information 

The various compilers listed in table C1 generally differ in the sets of
constructs that they deem errors or for which they produce a warning or other
information. Some of the compilers are more lenient or more particular than
others when it comes to the constructs that are accepted as valid code for
compilation. The Fortran 95 code base has benefited from exposure to a large
number of different compilers, and the coding standard contains guidelines for
promoting portability that grew out of this experience. Building and testing
on many different architectures and compilers is therefore important, and it
results in a code base that is very portable.

Compiler Maturity 

In addition to the problems discussed with performance, errors have been found 
in a number of compilers. Some versions of the compilers have contained errors 
that have prevented them from successfully compiling HEFSS. Also, compiled 
code will sometimes suffer run-time errors that are specific to the compiler
or its version. Some compiler vendors have been very quick to respond to 
compiler bug reports, and others have ignored our requests for resolution of 
these errors. 
Appendix D

Array Storage Performance Study Code 


! $Id: test_array_storage_performance.f90,v 1.2 2003/09/08 20:34:47 atkins Exp $
!
! Preliminary stab at testing the relative performance of various
! array types: through argument lists as assumed-size, assumed shape,
! and derived type; and via module data.

module kind_definitions

  integer, parameter :: iKind = selected_int_kind(r=8)
  integer, parameter :: rKind = selected_real_kind(p=15)

end module kind_definitions


module module_data

  use kind_definitions, only: iKind, rKind
  implicit none

  integer(iKind), save :: module_length

  integer(iKind), dimension(:), pointer, save :: module_data_array_int
  real(rKind),    dimension(:), pointer, save :: module_data_array_real

end module module_data


module type_definition

  use kind_definitions, only: iKind, rKind
  implicit none

  type derived
    integer(iKind)                        :: component_length
    integer(iKind), dimension(:), pointer :: component_array_int
    real(rKind),    dimension(:), pointer :: component_array_real
  end type derived

end module type_definition


module test_various_array_types

  use kind_definitions, only: iKind, rKind
  use type_definition,  only: derived
  use module_data,      only: module_data_array_int, module_data_array_real, &
                              module_length

  implicit none

  integer, save :: number_of_tests         = 0
  integer, save :: assumed_size_time       = 0
  integer, save :: assumed_shape_time      = 0
  integer, save :: module_data_time        = 0
  integer, save :: derived_type_time       = 0
  integer, save :: deref_derived_type_time = 0

contains

  subroutine reset_counters()

    number_of_tests         = 0
    assumed_size_time       = 0
    assumed_shape_time      = 0
    module_data_time        = 0
    derived_type_time       = 0
    deref_derived_type_time = 0

  end subroutine reset_counters

  subroutine read_array_types(logical_unit, array_size,                     &
               assumed_size_array_int,  assumed_size_array_real,            &
               assumed_shape_array_int, assumed_shape_array_real,           &
               derived_type)

    integer, intent(in) :: logical_unit
    integer, intent(in) :: array_size

    integer(iKind), dimension(array_size), intent(inout) :: assumed_size_array_int
    real(rKind),    dimension(array_size), intent(inout) :: assumed_size_array_real

    integer(iKind), dimension(:), intent(inout) :: assumed_shape_array_int
    real(rKind),    dimension(:), intent(inout) :: assumed_shape_array_real

    type(derived), intent(inout) :: derived_type

    integer(iKind), dimension(:), pointer :: dummy_array_int
    real(rKind),    dimension(:), pointer :: dummy_array_real

    integer :: i
    integer :: start, finish
    integer :: size

    continue

    number_of_tests = number_of_tests + 1

! arrays received through the argument list with explicit extent
    rewind(logical_unit)
    call system_clock(start)
    read(logical_unit) (assumed_size_array_int(i),  i=1,array_size)
    read(logical_unit) (assumed_size_array_real(i), i=1,array_size)
    call system_clock(finish)
    assumed_size_time = assumed_size_time + finish-start

! assumed-shape arrays
    rewind(logical_unit)
    call system_clock(start)
    read(logical_unit) (assumed_shape_array_int(i),  i=1,array_size)
    read(logical_unit) (assumed_shape_array_real(i), i=1,array_size)
    call system_clock(finish)
    assumed_shape_time = assumed_shape_time + finish-start

! arrays accessed as module data
    rewind(logical_unit)
    call system_clock(start)
    size = module_length
    read(logical_unit) (module_data_array_int(i),  i=1,size)
    read(logical_unit) (module_data_array_real(i), i=1,size)
    call system_clock(finish)
    module_data_time = module_data_time + finish-start

! arrays accessed as components of a derived type
    rewind(logical_unit)
    call system_clock(start)
    size = derived_type%component_length
    read(logical_unit) (derived_type%component_array_int(i),  i=1,size)
    read(logical_unit) (derived_type%component_array_real(i), i=1,size)
    call system_clock(finish)
    derived_type_time = derived_type_time + finish-start

! derived-type components dereferenced through local pointers
    rewind(logical_unit)
    call system_clock(start)
    size = derived_type%component_length
    dummy_array_int  => derived_type%component_array_int
    dummy_array_real => derived_type%component_array_real
    read(logical_unit) (dummy_array_int(i),  i=1,size)
    read(logical_unit) (dummy_array_real(i), i=1,size)
    call system_clock(finish)
    deref_derived_type_time = deref_derived_type_time + finish-start

! the argument-list arrays are timed a second time
    rewind(logical_unit)
    call system_clock(start)
    read(logical_unit) (assumed_size_array_int(i),  i=1,array_size)
    read(logical_unit) (assumed_size_array_real(i), i=1,array_size)
    call system_clock(finish)
    assumed_size_time = assumed_size_time + finish-start

  end subroutine read_array_types


subroutine work_with_array_types (array_size, & 

assumed_size_array_int, assumed_size_array_real, & 
assumed_shape_array_int, assumed_shape_array_real, & 
derived_type) 

integer, intent (in) : : array_size 

integer (iKind) , dimension (array_size) , intent (inout) : : assumed_size_array_int 
real (rKind) , dimension (array_size) , intent (inout) : : assumed_size_array_real 


integer (iKind) , dimension ( : ) , intent (inout) : : assumed_shape_array_int 
real (rKind) , dimension ( : ) , intent (inout) : : assumed_shape_array_real 


type (derived) , intent (inout ) 


derived_type 


integer (iKind) , dimension (:) , 
real (rKind) , dimension (:) , 


pointer 

pointer 


integer : : i, j 
integer : : start, finish 
integer : : size 


dummy_array_int 
dummy _a r r a y _r e a 1 


real (rKind) 


: : dot, sum, max_element 
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real (rKind) , dimension (:) , allocatable :: vector 


continue 

number_of_tests = number_of_tests + 1 

allocate ( vector (array_size) ) 

call random_n umber ( vector ) 

call system_clock (start ) 
do j = 1 , 5 

dot = 0 . 0_rKind 
do i = 1, array_size 

dot = dot + assumed_size_array_real (i) *vector (i) 
end do 

max_element = -huge (max_element ) 
do i = 1, array_size 

if ( abs (assumed_size_array_real (i) ) > abs (max_element ) ) & 
max_element = assumed_size_array_real (i) 
end do 

sum = 0 . 0_rKind 
do i = 1, array_size 

sum = sum + assumed_size_array_real (i) 
end do 

do i = 2, array_size 

assumed_size_array_real (i) = assumed_size_array_real (i) & 

+ assumed_size_array_real (i-1) 

end do 

do i = 1, array_size 

assumed_size_array_real (i) = assumed_size_array_real (i) & 

+ assumed_size_array_real (assumed_size_array_int (i) ) 
end do 
end do 

call system_clock (finish) 

assumed_size_time = assumed_size_time + finish-start 

call system_clock (start ) 
do j = 1 , 5 

dot = 0 . 0_rKind 
do i = 1, array_size 

dot = dot + assumed_shape_array_real (i) *vector (i) 
end do 

max_element = -huge (max_element ) 
do i = 1, array_size 

if ( abs (assumed_shape_array_real (i) ) > abs (max_element ) ) & 
max_element = assumed_shape_array_real (i) 
end do 

sum = 0 . 0_rKind 
do i = 1, array_size 

sum = sum + assumed_shape_array_real (i) 
end do 

do i = 2, array_size 

assumed_shape_array_real (i) = assumed_shape_array_real (i) & 

+ assumed_shape_array_real (i-1) 

end do 

do i = 1, array_size 

assumed_shape_array_real (i) = assumed_shape_array_real (i) & 

+ assumed_shape_array_real (assumed_shape_array_int (i) ) 
end do 
end do 

call system_clock (finish) 

assumed_shape_time = assumed_shape_time + finish-start 

    call system_clock(start)
    do j = 1, 5

      dot = 0.0_rKind
      do i = 1, array_size
        dot = dot + module_data_array_real(i)*vector(i)
      end do

      max_element = -huge(max_element)
      do i = 1, array_size
        if ( abs(module_data_array_real(i)) > abs(max_element) ) &
          max_element = module_data_array_real(i)
      end do

      sum = 0.0_rKind
      do i = 1, array_size
        sum = sum + module_data_array_real(i)
      end do

      do i = 2, array_size
        module_data_array_real(i) = module_data_array_real(i) &
                                  + module_data_array_real(i-1)
      end do

      do i = 1, array_size
        module_data_array_real(i) = module_data_array_real(i) &
                 + module_data_array_real(module_data_array_int(i))
      end do
    end do

    call system_clock(finish)

    module_data_time = module_data_time + finish-start

    call system_clock(start)
    do j = 1, 5

      dot = 0.0_rKind
      do i = 1, array_size
        dot = dot + derived_type%component_array_real(i)*vector(i)
      end do

      max_element = -huge(max_element)
      do i = 1, array_size
        if ( abs(derived_type%component_array_real(i)) > abs(max_element) ) &
          max_element = derived_type%component_array_real(i)
      end do

      sum = 0.0_rKind
      do i = 1, array_size
        sum = sum + derived_type%component_array_real(i)
      end do

      do i = 2, array_size
        derived_type%component_array_real(i)     &
          = derived_type%component_array_real(i) &
          + derived_type%component_array_real(i-1)
      end do

      do i = 1, array_size
        derived_type%component_array_real(i)     &
          = derived_type%component_array_real(i) &
          + derived_type%component_array_real(derived_type%component_array_int(i))
      end do
    end do

    call system_clock(finish)

    derived_type_time = derived_type_time + finish-start

    call system_clock(start)
    size = derived_type%component_length
    dummy_array_int  => derived_type%component_array_int
    dummy_array_real => derived_type%component_array_real
    do j = 1, 5

      dot = 0.0_rKind
      do i = 1, size
        dot = dot + dummy_array_real(i)*vector(i)
      end do

      max_element = -huge(max_element)
      do i = 1, size
        if ( abs(dummy_array_real(i)) > abs(max_element) ) &
          max_element = dummy_array_real(i)
      end do

      sum = 0.0_rKind
      do i = 1, size
        sum = sum + dummy_array_real(i)
      end do

      do i = 2, size
        dummy_array_real(i)     &
          = dummy_array_real(i) &
          + dummy_array_real(i-1)
      end do

      do i = 1, size
        dummy_array_real(i)     &
          = dummy_array_real(i) &
          + dummy_array_real(dummy_array_int(i))
      end do
    end do

    call system_clock(finish)

    deref_derived_type_time = deref_derived_type_time + finish-start
  end subroutine work_with_array_types
end module test_various_array_types


program test_array_storage_performance

  use kind_definitions, only: iKind, rKind
  use type_definition,  only: derived
  use module_data,      only: module_data_array_int, module_data_array_real, &
                              module_length
  use test_various_array_types, only: read_array_types,      &
                                      work_with_array_types, &
                                      reset_counters,        &
                                      assumed_size_time,     &
                                      assumed_shape_time,    &
                                      module_data_time,      &
                                      derived_type_time,     &
                                      deref_derived_type_time

  implicit none

  integer, parameter :: logical_unit   = 1
  integer, parameter :: number_of_runs = 10
  integer            :: array_size     = 20000

  integer(iKind), dimension(:), allocatable, target :: assumed_size_array_int
  real(rKind),    dimension(:), allocatable, target :: assumed_size_array_real

! integer(iKind), dimension(:), allocatable :: assumed_shape_array_int
! real(rKind),    dimension(:), allocatable :: assumed_shape_array_real

  type(derived) :: derived_type

  integer :: i, ii

  continue

  do ii = 1, 3

    write (*,*) " timings for array length ", array_size

    allocate ( assumed_size_array_int(array_size), &
               assumed_size_array_real(array_size) )

!   allocate ( assumed_shape_array_int(array_size), &
!              assumed_shape_array_real(array_size) )

    module_length = array_size

    allocate ( module_data_array_int(array_size), &
               module_data_array_real(array_size) )

!   allocate ( derived_type%component_array_int(array_size), &
!              derived_type%component_array_real(array_size) )

    derived_type%component_length = array_size

    derived_type%component_array_int  => assumed_size_array_int
    derived_type%component_array_real => assumed_size_array_real

    open (logical_unit, file='data', form='unformatted')

    write (logical_unit) ( (array_size-i+1), i=1, array_size )

    write (logical_unit) ( 1.0_rKind, i=1, array_size )

    do i = 1, number_of_runs
      call read_array_types ( logical_unit, array_size, &
        assumed_size_array_int, assumed_size_array_real, &
        derived_type%component_array_int, derived_type%component_array_real, &
        assumed_shape_array_int, assumed_shape_array_real, &
        derived_type%component_array_int, derived_type%component_array_real, &
        derived_type )
    end do

    close (logical_unit)
    write (*,*) '10 tests:'
    write (*,*) ' Assumed-size:', assumed_size_time
    write (*,*) ' Assumed-size:', assumed_size_time / real(assumed_size_time)
    write (*,*) ' Assumed-shape:', assumed_shape_time / real(assumed_size_time)
    write (*,*) ' Use Module:', module_data_time / real(assumed_size_time)
    write (*,*) ' Derived type:', derived_type_time / real(assumed_size_time)
    write (*,*) ' DeRef-Derived type:', deref_derived_type_time / real(assumed_size_time)

    call reset_counters()

    do i = 1, number_of_runs
      call work_with_array_types ( array_size, &
        assumed_size_array_int, assumed_size_array_real, &
        derived_type%component_array_int, derived_type%component_array_real, &
        assumed_shape_array_int, assumed_shape_array_real, &
        derived_type%component_array_int, derived_type%component_array_real, &
        derived_type )
    end do

    write (*,*) 'Work tests:'
    write (*,*) ' Assumed-size:', assumed_size_time
    write (*,*) ' Assumed-size:', assumed_size_time / real(assumed_size_time)
    write (*,*) ' Assumed-shape:', assumed_shape_time / real(assumed_size_time)
    write (*,*) ' Use Module:', module_data_time / real(assumed_size_time)
    write (*,*) ' Derived type:', derived_type_time / real(assumed_size_time)
    write (*,*) ' DeRef-Derived type:', deref_derived_type_time / real(assumed_size_time)


    deallocate ( assumed_size_array_int, assumed_size_array_real )
    deallocate ( assumed_shape_array_int, assumed_shape_array_real )

    deallocate ( module_data_array_int, module_data_array_real )

    deallocate ( derived_type%component_array_int, &
                 derived_type%component_array_real )

    array_size = array_size*10
  end do

end program test_array_storage_performance